Borzooei Shiva, Briganti Giovanni, Golparian Mitra, Lechien Jerome R, Tarokhian Aidin
Department of Endocrinology, Faculty of Medicine, Hamadan University of Medical Sciences, Hamadan, Iran.
Chair of AI and Digital Medicine, Faculty of Medicine, University of Mons, Mons, Belgium.
Eur Arch Otorhinolaryngol. 2024 Apr;281(4):2095-2104. doi: 10.1007/s00405-023-08299-w. Epub 2023 Oct 30.
The objective of this study was to train machine learning models for predicting the likelihood of recurrence in patients diagnosed with well-differentiated thyroid cancer. While thyroid cancer mortality remains low, the risk of recurrence is a significant concern. Identifying individual patient recurrence risk is crucial for guiding subsequent management and follow-ups.
In this prospective study, a cohort of 383 patients was observed for a minimum duration of 10 years within a 15-year timeframe. Thirteen clinicopathologic features were assessed to predict recurrence potential. Classic models (K-nearest neighbors, support vector machines (SVM), tree-based models) and artificial neural networks (ANN) were trained on three distinct combinations of features: a data set with all features excluding American Thyroid Association (ATA) risk score (12 features), another with ATA risk alone, and a third with all features combined (13 features). A total of 283 patients were allocated to training, and 100 patients were reserved for the validation stage.
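The three feature configurations and the 283/100 partition described above can be sketched as follows. This is a minimal illustration, not the authors' pipeline: the abstract does not name the 12 non-ATA features or say how patients were assigned to each group, so the feature names (`feature_1` to `feature_12`, `ata_risk`), the seeded random shuffle, and the helper functions are all assumptions.

```python
import random

N_PATIENTS, N_TRAIN = 383, 283  # cohort size and training size from the abstract

# Placeholder names: the abstract does not list the 12 non-ATA features.
OTHER_FEATURES = [f"feature_{i}" for i in range(1, 13)]
ATA_FEATURE = "ata_risk"  # hypothetical column name for the ATA risk score

# The three feature combinations the models were trained on.
FEATURE_SETS = {
    "without_ata": OTHER_FEATURES,                   # 12 features
    "ata_only": [ATA_FEATURE],                       # 1 feature
    "all_features": OTHER_FEATURES + [ATA_FEATURE],  # 13 features
}


def split_indices(n_total, n_train, seed=0):
    """Shuffle patient indices and split them into training and validation groups.

    A seeded random shuffle is an assumption; the abstract does not
    describe the allocation procedure.
    """
    indices = list(range(n_total))
    random.Random(seed).shuffle(indices)
    return indices[:n_train], indices[n_train:]


def select_columns(record, columns):
    """Keep only the requested feature values from one patient record (a dict)."""
    return [record[name] for name in columns]


train_idx, valid_idx = split_indices(N_PATIENTS, N_TRAIN)
```

Each model would then be fitted once per entry in `FEATURE_SETS`, using `select_columns` to build the input rows for the patients in `train_idx` and evaluated on those in `valid_idx`.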
The patients' mean age was 40.87 ± 15.13 years, with a majority being female (81%). When trained on the full data set, the models achieved the following sensitivity, specificity, and AUC, respectively: SVM (99.33%, 97.14%, 99.71), K-nearest neighbors (83%, 97.14%, 98.44), Decision Tree (87%, 100%, 99.35), Random Forest (99.66%, 94.28%, 99.38), ANN (96.6%, 95.71%, 99.64). Removing the ATA risk data increased the models' specificity but decreased their sensitivity. Conversely, training exclusively on ATA risk data had the opposite effect.
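For readers less familiar with the three reported metrics, the sketch below shows how sensitivity, specificity, and AUC are conventionally computed for a binary classifier. The labels and scores are invented toy values, not study data, and the 0.5 decision threshold is an assumption; the abstract does not state which threshold or software the authors used.

```python
def sensitivity_specificity(y_true, y_pred):
    """Return (sensitivity, specificity) for binary labels where 1 = recurrence."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    return tp / (tp + fn), tn / (tn + fp)


def auc(y_true, scores):
    """AUC as the probability that a random positive case is scored above
    a random negative case, with ties counted as one half."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy example (invented values, not from the study).
y_true = [1, 1, 1, 0, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.6, 0.3, 0.2, 0.1]
y_pred = [1 if s >= 0.5 else 0 for s in scores]  # assumed 0.5 threshold

sens, spec = sensitivity_specificity(y_true, y_pred)
area = auc(y_true, scores)
```

Sensitivity and specificity depend on the chosen threshold, whereas AUC summarizes ranking quality across all thresholds, which is why a model can show a high AUC alongside a comparatively lower sensitivity.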
Machine learning models, both classical models and neural networks, can effectively stratify the risk of recurrence in patients with well-differentiated thyroid cancer. This can aid in tailoring treatment intensity and determining appropriate follow-up intervals.