This is an overview page with metadata for this scientific paper. The full article is available from the publisher.

Performance and reliability of state-of-the-art LLMs in complex hand surgery scenarios: A prospective cross-sectional, double-blinded study

2026 · 0 citations · Journal of orthopaedic surgery · Open Access

Abstract

< 0.001). Notably, Gemini and Grok demonstrated consistently high performance with minimal variability, while ChatGPT and, in particular, DeepSeek exhibited considerable inconsistency in complex clinical judgments.

Conclusion: Gemini 2 and Grok 3 showed reliable and clinically relevant performance, positioning them as promising adjunctive tools for decision-making and education in hand surgery. The limitations of ChatGPT-5 and the significant shortcomings of DeepSeek underscore the need for cautious deployment and continued refinement.

Authors

Ahmet Savran

Institutions

Izmir Institute of Technology(TR)

Topics

Artificial Intelligence in Healthcare and Education · Diversity and Career in Medicine · Clinical Reasoning and Diagnostic Skills
