OpenAI o1 model outperforms human doctors in ER diagnoses
Key takeaways
- A recent study published in Science shows that LLMs can now outperform doctors at making diagnoses.
- The model was tasked with various activities, such as analyzing medical profiles, suggesting diagnoses, determining next steps, and estimating the likelihood of future health changes.
- On one task, the LLM achieved a perfect clinical reasoning score in 98% of cases.
A recent study published in Science shows that LLMs can now outperform doctors at making diagnoses. The research found that OpenAI's o1 model identified the correct or a nearly correct diagnosis in 67% of early ER cases, compared with roughly 50-55% for physicians.
The model was tasked with various activities, such as analyzing medical profiles, suggesting diagnoses, determining next steps, and estimating the likelihood of future health changes. In all these tasks, it performed on par with or better than physicians.
On one task, o1 received a perfect score in 98% of cases for how well it explained its diagnostic reasoning and proposed next steps, whereas physicians did so in only 35% of cases. This suggests the model may be more consistent in documenting and articulating medical logic, especially under stress.