LLMs Cross the Threshold: GPT-4.5 Passes the Turing Test
Type | research |
---|---|
Area | AI |
Published(YearMonth) | 2503 |
Source | https://arxiv.org/abs/2503.23674 |
Tag | newsletter |
Checkbox | |
Date(of entry) | |

Researchers ran two large-scale, randomized, controlled Turing tests to evaluate the human-likeness of four AI systems: ELIZA, GPT-4o, LLaMa-3.1-405B, and GPT-4.5. Each interrogator held simultaneous five-minute conversations with one human and one AI system, then judged which of the two was the human. When prompted to adopt a humanlike persona, GPT-4.5 was judged to be the human 73% of the time, meaning interrogators picked it over the real human participant significantly more often than chance. The authors describe this as the first empirical evidence of an AI system passing a standard three-party Turing test. LLaMa-3.1, given the same persona prompt, was judged to be the human 56% of the time, which is not significantly different from the humans it was compared against. The baseline systems ELIZA and GPT-4o were chosen significantly less often than chance (23% and 21%, respectively). These findings raise fresh questions about what kind of intelligence LLMs exhibit and about their growing societal influence.
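
To make the "pass" criterion concrete, here is a minimal sketch of how a win rate can be compared against the 50% chance level with an exact two-sided binomial test. The win rates are the ones quoted above; the trial count `HYPOTHETICAL_TRIALS = 100` is an assumption chosen for illustration only, not the study's actual sample size, and the function name `binom_two_sided_p` is likewise illustrative. The paper's own statistical analysis may differ.

```python
from math import comb

# ASSUMPTION: illustrative number of judged games per system, NOT the study's real count.
HYPOTHETICAL_TRIALS = 100

# Win rates (share of games in which the AI was judged to be the human), from the summary.
WIN_RATES = {
    "GPT-4.5 (persona)": 0.73,
    "LLaMa-3.1-405B (persona)": 0.56,
    "ELIZA": 0.23,
    "GPT-4o": 0.21,
}


def binom_two_sided_p(successes: int, trials: int, p0: float = 0.5) -> float:
    """Exact two-sided binomial p-value against the null win rate p0.

    Sums the probability of every outcome that is at most as likely
    as the observed one under the null hypothesis.
    """
    pmf = [comb(trials, k) * p0**k * (1 - p0) ** (trials - k) for k in range(trials + 1)]
    observed = pmf[successes]
    # Small relative tolerance so floating-point ties count as equal.
    return min(1.0, sum(x for x in pmf if x <= observed * (1 + 1e-9)))


def describe(rate: float, trials: int = HYPOTHETICAL_TRIALS, alpha: float = 0.05) -> str:
    """Label a win rate as above, below, or indistinguishable from chance."""
    wins = round(rate * trials)
    p_value = binom_two_sided_p(wins, trials)
    if p_value >= alpha:
        return "not distinguishable from chance"
    return "above chance" if rate > 0.5 else "below chance"


if __name__ == "__main__":
    for system, rate in WIN_RATES.items():
        print(f"{system}: {rate:.0%} -> {describe(rate)}")
```

Under this reading, a system "passes" when interrogators cannot pick out the human at a rate better than chance, that is, when its win rate is not significantly below 50%. With a different trial count the p-values change, so the labels printed here illustrate the logic rather than reproduce the paper's results.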