Fragility of Language Models to Drug Name Variants
Type | research |
---|---|
Area | AIMedical |
Published(YearMonth) | 2406 |
Source | https://arxiv.org/abs/2406.12066 |
Tag | newsletter |
Checkbox | |
Date(of entry) | |
This paper shows that large language models (LLMs) are surprisingly fragile to variations in drug names, such as brand versus generic forms, within biomedical benchmarks. The authors introduce the RABBITS dataset, built by swapping brand and generic drug names in existing medical QA benchmarks, and observe a 1-10% drop in accuracy when the names are exchanged. They attribute this fragility to benchmark contamination in pre-training data, raising concerns about the robustness of LLMs in critical medical applications.
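As a rough illustration of the evaluation idea (not the authors' code), below is a minimal Python sketch: replace brand names with generic equivalents using a small hypothetical mapping, then compare a model's accuracy before and after the swap. The drug pairs and the `model_answer_fn` callable are illustrative assumptions; the real RABBITS dataset is constructed differently and at much larger scale.

```python
import re

# Hypothetical brand -> generic pairs for illustration only;
# the actual RABBITS mapping is far more extensive.
BRAND_TO_GENERIC = {
    "Tylenol": "acetaminophen",
    "Advil": "ibuprofen",
    "Zestril": "lisinopril",
}


def swap_drug_names(text: str, mapping: dict[str, str]) -> str:
    """Replace each brand name in `text` with its generic equivalent."""
    for brand, generic in mapping.items():
        text = re.sub(rf"\b{re.escape(brand)}\b", generic, text,
                      flags=re.IGNORECASE)
    return text


def accuracy(model_answer_fn, questions: list[dict]) -> float:
    """Fraction of questions the model answers correctly.

    `model_answer_fn` is an assumed callable: question string -> answer string.
    Each question dict has "question" and gold "answer" keys.
    """
    correct = sum(model_answer_fn(q["question"]) == q["answer"]
                  for q in questions)
    return correct / len(questions)


def robustness_gap(model_answer_fn, questions: list[dict]) -> float:
    """Accuracy on original questions minus accuracy on name-swapped ones.

    A positive gap indicates the fragility the paper reports.
    """
    swapped = [
        {**q, "question": swap_drug_names(q["question"], BRAND_TO_GENERIC)}
        for q in questions
    ]
    return accuracy(model_answer_fn, questions) - accuracy(model_answer_fn, swapped)
```

Under this sketch, a 1-10% performance drop corresponds to `robustness_gap` returning a value between 0.01 and 0.10 for the tested models.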