Instance attribution (IA) aims to identify the training instances that lead to a model's prediction on a test example, helping researchers better understand the dataset and optimize data processing.
The radioactive nature of Large Language Model (LLM) watermarking enables the detection of watermarks inherited by student models when trained on the outputs of watermarked teacher models, making it a promising tool for preventing…
Miller Fellow at the Miller Institute for Basic Research in Science · Experience: The Miller Institute for Basic Research in Science · Education: Massachusetts Institute of Technology · Location: United States · 70 connections on…