Fragility of Language Models to Drug Name Variants

Type: research
Area: AI, Medical
Published (YearMonth): 2406
Source: https://arxiv.org/abs/2406.12066
Tag: newsletter

This paper shows that large language models (LLMs) are surprisingly fragile to variations in drug names, such as brand versus generic forms, within biomedical benchmarks. The authors introduce RABBITS, a benchmark built by swapping brand and generic drug names in medical QA datasets using physician-verified pairs, and observe a consistent 1-10% accuracy drop when names are swapped. They attribute this fragility to test-data contamination in pre-training corpora, raising concerns about the robustness of LLMs in critical medical applications.
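The evaluation idea is straightforward to reproduce: rewrite each benchmark question with drug names swapped, then compare accuracy on the original and swapped variants. Below is a minimal Python sketch of that idea; the drug pairs, `swap_drug_names`, and `answer_fn` are illustrative placeholders, not the authors' RABBITS pipeline, which relies on expert-verified mappings.

```python
import re

# Hypothetical brand <-> generic pairs for illustration only; the paper
# uses physician-verified mappings, not this toy list.
GENERIC_TO_BRAND = {
    "acetaminophen": "Tylenol",
    "ibuprofen": "Advil",
    "atorvastatin": "Lipitor",
}

def swap_drug_names(text: str, mapping: dict[str, str]) -> str:
    """Replace each generic drug name with its brand name
    (case-insensitive, whole-word matches only)."""
    for generic, brand in mapping.items():
        text = re.sub(rf"\b{re.escape(generic)}\b", brand, text,
                      flags=re.IGNORECASE)
    return text

def robustness_gap(qa_items, answer_fn, mapping=GENERIC_TO_BRAND):
    """Score a model on original and name-swapped questions.

    qa_items:  list of (question, gold_answer) pairs
    answer_fn: callable mapping a question string to the model's answer
    Returns (accuracy_original, accuracy_swapped).
    """
    orig = sum(answer_fn(q) == a for q, a in qa_items) / len(qa_items)
    swapped = sum(answer_fn(swap_drug_names(q, mapping)) == a
                  for q, a in qa_items) / len(qa_items)
    return orig, swapped
```

A large gap between the two accuracies suggests the model is keying on surface forms memorized from pre-training data rather than on underlying drug knowledge, which is the contamination concern the paper raises.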