OpenAlex · Updated hourly · Last updated: 27 May 2026, 07:28

This is an overview page with metadata for this scholarly work. The full article is available from the publisher.

Deriving the OTA/AO fracture classification from routinely collected radiology reports using a large language model

2026 · 0 citations · OTA International: The Open Access Journal of Orthopaedic Trauma · Open Access

Citations: 0 · Authors: 11 · Year: 2026

Abstract

Objectives: Fracture classification plays a pivotal role in research and quality assurance; despite its wide acceptance, the OTA/AO classification is seldom documented in patients' electronic medical records, which impedes fracture registry creation and effective interdisciplinary communication. In this study, we investigate "off-the-shelf" large language models (LLMs) in translating free text in radiology reports into OTA/AO classification labels.

Methods: We employed a Health Insurance Portability and Accountability Act-compliant LLM to classify 109 fracture descriptions from randomly selected radiology reports in a deidentified electronic medical record database. Ground-truth classifications were assigned by expert orthopaedic traumatologists based on corresponding radiographs. Multiple prompting strategies were tested, including zero-shot prompting, zero-shot chain-of-thought prompting, and retrieval-augmented generation. We additionally asked the LLM to assign classification labels to "ideal" fracture descriptions written according to the 2018 OTA/AO Fracture and Dislocation Classification Compendium. Model performance was assessed using Cohen kappa and accuracy against ground-truth labels.

Results: [...] levels. Performance declined to slight agreement at the subgroup level. The best performance was observed using ideal fracture descriptions with retrieval-augmented generation, in which the agreement between the full LLM-generated and ground-truth labels remained moderate. Classification errors were largely due to imprecise descriptions, hallucinated information, or incorrect application of factually correct information.

Conclusions: Our study demonstrates some potential for LLMs to translate free-text fracture descriptions into OTA/AO classifications, allowing for efficient labeling of large datasets of radiology reports. Future work should focus on refining model classification capabilities using more sophisticated prompting methods.

Level of Evidence: Level III.
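The abstract names Cohen kappa and accuracy as the evaluation metrics but does not show how they were computed. The following is a minimal sketch of both metrics in plain Python, not the authors' code. The function names and the four toy labels are hypothetical and chosen only for illustration; they are not data from the study.

```python
from collections import Counter


def accuracy(truth, pred):
    """Fraction of items where the predicted label equals the ground truth."""
    if len(truth) != len(pred) or not truth:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(t == p for t, p in zip(truth, pred)) / len(truth)


def cohen_kappa(truth, pred):
    """Unweighted Cohen kappa: observed agreement corrected for chance."""
    n = len(truth)
    p_observed = accuracy(truth, pred)
    truth_counts = Counter(truth)
    pred_counts = Counter(pred)
    # Chance agreement: probability that both raters pick the same label
    # if each drew independently from its own label distribution.
    p_expected = sum(
        truth_counts[label] * pred_counts[label] for label in truth_counts
    ) / (n * n)
    if p_expected == 1.0:
        # Both raters used one identical label throughout; kappa is
        # undefined here. Returning 1.0 is a convention chosen for this sketch.
        return 1.0
    return (p_observed - p_expected) / (1.0 - p_expected)


# Hypothetical toy labels, not taken from the paper.
ground_truth = ["32A", "32A", "32B", "32B"]
llm_labels = ["32A", "32B", "32B", "32B"]

print(accuracy(ground_truth, llm_labels))     # 0.75
print(cohen_kappa(ground_truth, llm_labels))  # 0.5
```

In this toy case three of four labels match, so accuracy is 0.75, while chance agreement is 0.5, which gives a kappa of 0.5. Terms such as "slight" and "moderate" agreement are commonly read against the Landis and Koch bands (0.00 to 0.20 slight, 0.41 to 0.60 moderate); whether the authors used exactly those thresholds is an assumption here.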

Similar works