Department of Radiology, Ajou University School of Medicine, Wonchon-Dong, Yeongtong-Gu, Suwon, 16499, South Korea.
Department of Radiology, GangNeung Asan Hospital, University of Ulsan College of Medicine, Gangneung-si, Gangwon-do, 25440, South Korea.
Eur Radiol. 2023 May;33(5):3211-3221. doi: 10.1007/s00330-022-09376-0. Epub 2023 Jan 4.
We constructed and validated a machine learning-based malignancy risk estimation model using predefined clinicoradiological features, and evaluated its clinical utility for the management of thyroid nodules.
In total, 5708 benign (n = 4597) and malignant (n = 1111) thyroid nodules were collected from 5081 consecutive patients treated in 26 institutions. Seventeen experienced radiologists evaluated nodule characteristics on ultrasonographic images. Eight predictive models were used to stratify the thyroid nodules according to malignancy risk; model performance was assessed via nested 10-fold cross-validation. The best-performing algorithm was externally validated using data for 454 thyroid nodules from a tertiary hospital, then compared to the Thyroid Imaging Reporting and Data System (TIRADS)-based interpretations of radiologists (American College of Radiology, European and Korean TIRADS, and AACE/ACE/AME guidelines).
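The nested cross-validation mentioned above can be illustrated with a minimal sketch. It is not the authors' pipeline: the one-feature cut-off classifier, the hyperparameter `w`, the synthetic data, and every function name here are assumptions made only for illustration. The point it shows is the structure: an inner loop picks the hyperparameter using training data alone, and an outer loop scores that choice on a fold the inner loop never saw.

```python
import random
from statistics import mean


def k_folds(indices, k, rng):
    """Shuffle the indices and deal them into k roughly equal folds."""
    shuffled = indices[:]
    rng.shuffle(shuffled)
    return [shuffled[i::k] for i in range(k)]


def fit(x, y, idx, w):
    """Toy model (an assumption): a cut-off placed between the class means.

    w is the hyperparameter; w = 0.5 puts the cut-off midway.
    """
    m0 = mean(x[i] for i in idx if y[i] == 0)
    m1 = mean(x[i] for i in idx if y[i] == 1)
    return m0 + w * (m1 - m0)


def accuracy(cutoff, x, y, idx):
    """Fraction of cases in idx classified correctly at this cut-off."""
    return mean(int((x[i] >= cutoff) == (y[i] == 1)) for i in idx)


def cv_accuracy(x, y, folds, w):
    """Mean held-out accuracy for hyperparameter w across the given folds."""
    accs = []
    for f, held_out in enumerate(folds):
        train = [i for g, fold in enumerate(folds) if g != f for i in fold]
        accs.append(accuracy(fit(x, y, train, w), x, y, held_out))
    return mean(accs)


def nested_cv(x, y, grid, k=10, seed=0):
    """Return one outer-fold accuracy per fold of a nested k-fold CV."""
    rng = random.Random(seed)
    outer = k_folds(list(range(len(x))), k, rng)
    results = []
    for f, test in enumerate(outer):
        train = [i for g, fold in enumerate(outer) if g != f for i in fold]
        # Inner loop: select w using the outer-training data only.
        inner = k_folds(train, k, rng)
        best_w = max(grid, key=lambda w: cv_accuracy(x, y, inner, w))
        # Refit on the full outer-training set, then score the untouched fold.
        cutoff = fit(x, y, train, best_w)
        results.append(accuracy(cutoff, x, y, test))
    return results


# Synthetic, purely illustrative data: 160 "benign" (0) and 40 "malignant" (1).
data_rng = random.Random(42)
y = [0] * 160 + [1] * 40
x = [data_rng.gauss(2.0 if label else 0.0, 1.0) for label in y]

scores = nested_cv(x, y, grid=[0.3, 0.5, 0.7])
print(len(scores), round(mean(scores), 3))
```

Because the hyperparameter is never tuned on the outer test fold, the mean of the outer-fold scores is a less optimistic performance estimate than one taken from a single cross-validation loop that both tunes and evaluates.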
The areas under the receiver operating characteristic curve (AUROCs) of the algorithms ranged from 0.773 to 0.862. The sensitivities, specificities, positive predictive values, and negative predictive values of the best-performing models were 74.1-76.6%, 80.9-83.4%, 49.2-51.9%, and 93.0-93.5%, respectively. In the external validation set, the corresponding values for the ElasticNet model were 83.2%, 89.2%, 81.8%, and 90.1%, respectively, whereas those for the TIRADS-based interpretations were 66.5-85.0%, 61.3-80.8%, 45.9-72.1%, and 81.5-90.3%, respectively. The new model exhibited a significantly higher AUROC and specificity than did the TIRADS risk stratification, although its sensitivity was similar.
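As a reminder of how the four reported measures relate to a confusion matrix, here is a short sketch. The counts are hypothetical: the abstract does not report a confusion matrix, and these numbers were chosen only because they sum to 454 nodules and reproduce the rounded external-validation percentages. The function name `diagnostic_metrics` is likewise an assumption.

```python
def diagnostic_metrics(tp, fn, tn, fp):
    """Sensitivity, specificity, PPV and NPV from confusion-matrix counts."""
    return {
        "sensitivity": tp / (tp + fn),  # malignant nodules correctly flagged
        "specificity": tn / (tn + fp),  # benign nodules correctly cleared
        "ppv": tp / (tp + fp),          # flagged nodules that are malignant
        "npv": tn / (tn + fn),          # cleared nodules that are benign
    }


# Hypothetical counts (NOT from the paper), totalling 454 nodules.
counts = dict(tp=139, fn=28, tn=256, fp=31)
metrics = diagnostic_metrics(**counts)

for name, value in metrics.items():
    print(f"{name}: {100 * value:.1f}%")
# sensitivity: 83.2%
# specificity: 89.2%
# ppv: 81.8%
# npv: 90.1%
```

Sensitivity and specificity depend only on the classifier's behaviour within each class, whereas the positive and negative predictive values also depend on the proportion of malignant nodules in the sample, so predictive values from cohorts with different malignancy rates are not directly comparable.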
We developed a reliable machine learning-based predictive model that demonstrated enhanced specificity when stratifying thyroid nodules according to malignancy risk. This system will contribute to improved personalized management of thyroid nodules.
• The area under the receiver operating characteristic curve (AUROC), sensitivity, and specificity of our model were 0.914, 83.2%, and 89.2%, respectively (derived using the validation dataset).
• Compared to the TIRADS values, the AUROC and specificity were significantly higher, while the sensitivity was similar.
• An interactive version of our AI algorithm is available at http://tirads.cdss.co.kr.