This is an overview page with metadata for this scientific work. The full article is available from the publisher.
Construction of a Prediction Model for Thyroid Nodules Based on Machine Learning
Citations: 0
Authors: 5
Year: 2026
Abstract
The increasing incidence of papillary thyroid carcinoma (PTC) has intensified debate over its management. The lack of reliable methods for distinguishing indolent from aggressive cases fuels persistent advocacy for surgical intervention [1, 2]. This controversy underscores a critical gap: the absence of robust preoperative tools to evaluate prognostic factors such as malignancy risk and lymph node metastasis. Artificial intelligence (AI) now plays a significant role in medical practice, including predicting patient trajectories and reducing the risk of misdiagnosis [3-5]. The current clinical TI-RADS grading standard classifies category 4 thyroid nodules into subtypes with varying malignancy rates, so predicting the benign or malignant status of nodules within this range remains of substantial clinical value. In this study, we employed machine learning techniques to integrate TI-RADS and clinical indicators to predict the benign or malignant status of TI-RADS category 4 thyroid nodules. Among the models, random forest (RF) demonstrated superior performance on several metrics, particularly overall prediction accuracy.
We analysed a dataset of 1969 patients with thyroid nodules who visited Shanghai Ruijin Rehabilitation Hospital. Among these patients, 764 were classified as having TI-RADS category 4 nodules. Rigorous postoperative pathological evaluations were conducted for all participants, and their pathological diagnoses, along with the relevant clinical indicators, were integrated into the analysis dataset. The mean age of the participants was 45.24 years (range, 14 to 79 years). Of the 764 patients, 181 (23.69%) were male and 583 (76.31%) were female. All thyroid ultrasound images were evaluated by a physician specializing in clinical ultrasound.
The evaluation revealed 379 (49.61%) category 4A nodules, 297 (38.87%) category 4B nodules, and 88 (11.52%) category 4C nodules, and the final pathological results revealed 448 malignant cases (58.64%) and 316 benign cases (41.36%). Using this dataset, we evaluated different machine learning models and their respective hyperparameters, identifying the highest accuracy achieved by each model (Table S1). We assessed the diagnostic performance of these models on the validation set by using a fixed random number seed (Table 1). We also quantified the diagnostic performance metrics of the four models within the individual TI-RADS subcategories and compared them against the benchmark malignancy probabilities defined by the TI-RADS classification system (Table S2). Among the models, RF demonstrated superior performance on several metrics, particularly overall prediction accuracy, which was 0.7016. RF also achieved the highest AUC, 0.7593, indicating that it discriminated between benign and malignant nodules better than the other models. These results suggest that the RF model offers a comprehensive capability to identify the nature of nodules. Although the support vector machine (SVM) did not excel in overall accuracy, it exhibited a remarkably high sensitivity of 0.8648. This high sensitivity implies that the SVM correctly identifies most malignant nodules, that is, it has a high true-positive rate. Consequently, the SVM was effective in detecting the majority of malignant cases. The multilayer perceptron (MLP), on the other hand, performed well in terms of the positive predictive value (PPV), reaching 0.7164. This finding indicates that nodules the MLP classifies as malignant are relatively likely to be truly malignant.
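The metrics discussed above (accuracy, sensitivity, PPV, and AUC) can all be derived from a model's predictions on a labelled validation set. The following is a minimal sketch of those definitions, not the authors' evaluation code. The function names `confusion_metrics` and `auc` and every count and score in the example are hypothetical and chosen only for illustration; none of them come from the study.

```python
from typing import Sequence


def confusion_metrics(tp: int, fp: int, tn: int, fn: int) -> dict[str, float]:
    """Compute standard diagnostic metrics from confusion-matrix counts.

    "Positive" means malignant here; "negative" means benign.
    """
    return {
        "accuracy": (tp + tn) / (tp + fp + tn + fn),
        "sensitivity": tp / (tp + fn),  # share of malignant nodules detected
        "specificity": tn / (tn + fp),  # share of benign nodules cleared
        "ppv": tp / (tp + fp),          # share of "malignant" calls that are correct
        "npv": tn / (tn + fn),          # share of "benign" calls that are correct
    }


def auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the fraction of (malignant, benign) pairs ranked correctly.

    Ties between a malignant and a benign score count as half a correct pair.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical counts for illustration only; NOT taken from the study.
example = confusion_metrics(tp=80, fp=30, tn=70, fn=20)
for name, value in example.items():
    print(f"{name}: {value:.4f}")

# Hypothetical predicted probabilities of malignancy for four nodules.
print(f"auc: {auc([1, 1, 0, 0], [0.9, 0.4, 0.5, 0.1]):.4f}")
```

Sensitivity is conditioned on the true class (how many malignant nodules are caught), whereas PPV is conditioned on the prediction (how many nodules flagged as malignant really are malignant). A model can therefore lead on one of these metrics without leading on the other.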
Compared with the TI-RADS classification, the machine learning models demonstrated superior diagnostic efficacy when analysing nodules across the category 4 subcategories. Notably, for the 4A and 4B nodules, all four models achieved accuracies exceeding the malignancy probability thresholds defined by the TI-RADS criteria. While the models maintained diagnostic reliability for 4C nodules, the limited sample size of this subclass in our cohort means that validation in larger multicentre datasets is needed to confirm generalizability. Importantly, clinical management protocols prioritize aggressive intervention for most 4C nodules because of their high probability of malignancy, which reduces the clinical utility of predictive modelling for this subclass. Consequently, the enhanced predictive ability of machine learning models for 4A and 4B nodules has greater translational value, potentially reducing unnecessary invasive procedures. In the comparative analysis of machine learning models for thyroid nodule malignancy prediction, each model's advantage depends on the diagnostic priority. RF offers the best balance between overall accuracy and AUC; the SVM, with its high sensitivity, captures as many malignant cases as possible; and the MLP is more reliable when it predicts a positive (malignant) result. Therefore, the selection of an appropriate model should be based on clinical requirements and the specific focus of the application. This study has several limitations. The retrospective design and the limited sample size may introduce selection bias, and the single-centre data are another constraint. Multicentre validation is essential; however, practical challenges regarding data sharing and consent currently place this step outside the immediate scope of our proof-of-concept study. Therefore, larger-scale prospective multicentre clinical studies are needed to verify the validity and applicability of the models.
In this study, we employed machine learning techniques to integrate TI-RADS and clinical indicators to predict the benign or malignant status of TI-RADS category 4 thyroid nodules. Among the models, RF demonstrated superior performance on several metrics, offering a comprehensive capability to identify the nature of nodules accurately. Rather than viewing these models in isolation, we propose that their complementary strengths in sensitivity and specificity can be leveraged through integration or a staged decision-making framework, thereby enhancing overall clinical utility beyond what any single model could achieve.
Siye Gong: data curation (lead), formal analysis (lead), investigation (lead), methodology (lead), software (lead), writing – original draft (lead), writing – review and editing (equal).
Zhicheng Wang: data curation (lead), formal analysis (lead), investigation (lead), methodology (lead), software (lead), writing – original draft (lead), writing – review and editing (equal).
Yulin Xu: data curation (equal), formal analysis (equal), investigation (equal), methodology (equal), software (equal), writing – original draft (equal).
Rongli Xie: conceptualization (lead), data curation (equal), formal analysis (equal), funding acquisition (lead), investigation (equal), methodology (equal), project administration (lead), supervision (lead), validation (equal), writing – original draft (lead), writing – review and editing (lead).
Jian Fei: conceptualization (lead), data curation (lead), formal analysis (equal), funding acquisition (lead), investigation (equal), methodology (lead), project administration (lead), resources (lead), supervision (lead), validation (lead), writing – original draft (lead), writing – review and editing (lead).
All authors have read and approved the final manuscript.
We gratefully acknowledge the valuable support provided by colleagues from Ruijin Rehabilitation Hospital and Ruijin Hospital.
This work was supported by the Shanghai Municipal Health Commission (Grant No. QJZXYJK-202401) and the Shanghai Huangpu District Health Commission (Grant Nos. 2023GG01 and 2023XD02). This retrospective study was approved by the Ethics Committee of Shanghai Ruijin Hospital Luwan Branch (LWEC2022009). All procedures were performed according to the principles of the Declaration of Helsinki. Because this was a retrospective study that evaluated anonymized data, the requirement for patient consent was waived by our institutional ethics committee. The authors declare no conflicts of interest. The datasets analysed during the current study are available from the corresponding author on reasonable request.
Similar works
2015 American Thyroid Association Management Guidelines for Adult Patients with Thyroid Nodules and Differentiated Thyroid Cancer: The American Thyroid Association Guidelines Task Force on Thyroid Nodules and Differentiated Thyroid Cancer
2015 · 16,176 citations
Revised American Thyroid Association Management Guidelines for Patients with Thyroid Nodules and Differentiated Thyroid Cancer
2009 · 6,731 citations
Serum TSH, T4, and Thyroid Antibodies in the United States Population (1988 to 1994): National Health and Nutrition Examination Survey (NHANES III)
2002 · 3,850 citations
Increasing Incidence of Thyroid Cancer in the United States, 1973-2002
2006 · 3,353 citations
Integrated Genomic Characterization of Papillary Thyroid Carcinoma
2014 · 3,027 citations