Dies ist eine Übersichtsseite mit Metadaten zu dieser wissenschaftlichen Arbeit. Der vollständige Artikel ist beim Verlag verfügbar.

Distilling large language models for matching patients to clinical trials

2024·59 Zitationen·Journal of the American Medical Informatics AssociationOpen Access

Volltext beim Verlag öffnen

Zitationen

Autoren

2024

Jahr

Abstract

OBJECTIVE: The objective of this study is to systematically examine the efficacy of both proprietary (GPT-3.5, GPT-4) and open-source large language models (LLMs) (LLAMA 7B, 13B, 70B) in the context of matching patients to clinical trials in healthcare. MATERIALS AND METHODS: The study employs a multifaceted evaluation framework, incorporating extensive automated and human-centric assessments along with a detailed error analysis for each model, and assesses LLMs' capabilities in analyzing patient eligibility against clinical trial's inclusion and exclusion criteria. To improve the adaptability of open-source LLMs, a specialized synthetic dataset was created using GPT-4, facilitating effective fine-tuning under constrained data conditions. RESULTS: The findings indicate that open-source LLMs, when fine-tuned on this limited and synthetic dataset, achieve performance parity with their proprietary counterparts, such as GPT-3.5. DISCUSSION: This study highlights the recent success of LLMs in the high-stakes domain of healthcare, specifically in patient-trial matching. The research demonstrates the potential of open-source models to match the performance of proprietary models when fine-tuned appropriately, addressing challenges like cost, privacy, and reproducibility concerns associated with closed-source proprietary LLMs. CONCLUSION: The study underscores the opportunity for open-source LLMs in patient-trial matching. To encourage further research and applications in this field, the annotated evaluation dataset and the fine-tuned LLM, Trial-LLAMA, are released for public use.

Autoren

Institutionen

Themen

Artificial Intelligence in Healthcare and EducationMachine Learning in HealthcareElectronic Health Records Systems

Volltext beim Verlag öffnen

Distilling large language models for matching patients to clinical trials

Abstract

Ähnliche Arbeiten

Autoren

Institutionen

Themen