
Large Language Models Enhance Disease Detection in Electronic Health Records: A Focus on Crystal Arthropathies

RMD Open

Researchers at Geneva University Hospitals in Switzerland have used a large language model (LLM) to improve disease detection in electronic health records (EHRs). Their study investigates how accurately Meta's Llama-3-8B model identifies crystal arthropathies, such as gout and calcium pyrophosphate deposition disease (CPPD), in French-language EHRs.


Key Findings

  • The LLM framework achieved 95.4% accuracy in detecting gout from EHR documents.
  • It demonstrated 94.1% accuracy in identifying CPPD.
  • It outperformed a traditional regex method on both positive and negative predictive value.
  • Its performance remained robust across a variety of parameter settings.

"LLMs accurately detected disease diagnoses from EHRs, even in non-English languages. They could facilitate the creation of large disease registers in any language, enhancing disease care assessment and patient recruitment for clinical trials," - Study Authors.

Why It Matters

The implications of this study reach beyond Geneva University Hospitals. Accurate disease detection from EHRs is essential for improving patient care and optimizing clinical trials. By automating this process, healthcare providers can devote more time to patient care, while researchers gain access to larger, more reliable datasets for analysis.

Additionally, the study underscores the growing significance of artificial intelligence in healthcare, particularly in non-English-speaking regions. This advancement could foster more inclusive global health initiatives, so that language barriers do not hinder medical progress.


Research Details

The interdisciplinary Geneva team built a training and testing set of 700 paragraphs focused on 'gout', a term whose French equivalent, 'goutte', also means 'drop'. Researchers manually classified each paragraph as disease (true gout) or non-disease, establishing a gold standard against which to evaluate the LLM's performance.
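To illustrate how a manually labelled gold standard turns into the figures reported in this article, here is a minimal sketch that computes positive predictive value, negative predictive value, and accuracy from two lists of labels. The function name `evaluate`, the `Metrics` container, and the toy labels are assumptions made for illustration; they are not the study's code or data.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass
class Metrics:
    ppv: float       # positive predictive value: TP / (TP + FP)
    npv: float       # negative predictive value: TN / (TN + FN)
    accuracy: float  # (TP + TN) / all paragraphs


def evaluate(gold: Sequence[bool], predicted: Sequence[bool]) -> Metrics:
    """Score predictions against gold-standard labels (True = disease)."""
    if len(gold) != len(predicted) or not gold:
        raise ValueError("gold and predicted must be non-empty and equal in length")

    pairs = list(zip(gold, predicted))
    tp = sum(1 for g, p in pairs if g and p)
    tn = sum(1 for g, p in pairs if not g and not p)
    fp = sum(1 for g, p in pairs if not g and p)
    fn = sum(1 for g, p in pairs if g and not p)

    # Guard against division by zero when a class is never predicted.
    ppv = tp / (tp + fp) if (tp + fp) else float("nan")
    npv = tn / (tn + fn) if (tn + fn) else float("nan")
    return Metrics(ppv=ppv, npv=npv, accuracy=(tp + tn) / len(pairs))


if __name__ == "__main__":
    # Toy labels for five paragraphs, invented for this example.
    gold = [True, True, False, False, False]
    predicted = [True, False, False, False, True]
    print(evaluate(gold, predicted))
```

On these toy labels there is one true positive, one false positive, one false negative, and two true negatives, so the sketch yields a PPV of 0.5, an NPV of about 0.67, and an accuracy of 0.6.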

The study also included a validation phase with 600 paragraphs related to CPPD. The LLM was run with advanced prompting techniques, and its accuracy was compared against that of a regex-based method.
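For readers unfamiliar with regex baselines, the following is a rough sketch of what a keyword-and-rule classifier of this kind might look like. The patterns, the function name `regex_classify`, and the sample sentences are all hypothetical; the article does not describe the study's actual regular expressions.

```python
import re

# Hypothetical keyword pattern: the French "goutte" can mean "gout" or "drop".
KEYWORD = re.compile(r"\bgouttes?\b", re.IGNORECASE)

# Hypothetical non-disease contexts: a dosage counted in drops ("2 gouttes"),
# or eye and nose drops ("gouttes oculaires", "gouttes nasales").
NON_DISEASE = re.compile(
    r"\b\d+\s+gouttes?\b|\bgouttes?\s+(?:oculaires?|nasales?)\b",
    re.IGNORECASE,
)


def regex_classify(paragraph: str) -> bool:
    """Return True if the paragraph is flagged as referring to the disease."""
    if not KEYWORD.search(paragraph):
        return False
    return NON_DISEASE.search(paragraph) is None


if __name__ == "__main__":
    samples = [
        "Patient connu pour une goutte tophacée.",   # disease mention
        "Instiller 2 gouttes dans l'oeil droit.",    # dosage in drops
        "Pas de goutte connue.",                     # negated mention
    ]
    for text in samples:
        print(regex_classify(text), "-", text)
```

This sketch flags the first sentence, rejects the second, and still flags the third even though it negates the diagnosis. Hand-written rules tend to struggle with context of that kind, which is one plausible reason a language model could do better on the same paragraphs.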

The data was sourced from Geneva University Hospitals' EHRs, a rich repository serving a diverse population. This allowed researchers to evaluate the framework's effectiveness in real-world conditions, accounting for the complexities of medical language.

"The LLM-based algorithm outperformed the regex method, achieving a 92.7% positive predictive value, a 96.6% negative predictive value, and an accuracy of 95.4% for gout," - Research Team.

Looking Ahead

The success of this study opens new pathways for automating disease detection in EHRs across various languages and healthcare systems. The potential to create large, accurate disease registers is particularly promising for clinical trials, where patient recruitment and data accuracy are critical.

Future research could focus on expanding this framework to other diseases and languages, further illustrating the versatility and power of LLMs in healthcare. Additionally, integrating such models into daily clinical workflows could lead to real-time decision support tools for healthcare professionals.

In conclusion, the study highlights the potential of LLMs to detect diagnoses in non-English EHRs and provides a reference point for future work. As healthcare adopts AI and machine learning more widely, such tools are likely to play a growing role in medical research and patient care.
