1 Department of Radiology, The First Affiliated Hospital of Jinan University, Guangzhou, Guangdong, P.R. China.
2 Key Laboratory of Molecular Imaging, Chinese Academy of Sciences, Beijing, P.R. China.
Thyroid. 2019 Jun;29(6):858-867. doi: 10.1089/thy.2018.0380. Epub 2019 Apr 27.
Ultrasound (US) examination is helpful in the differential diagnosis of thyroid nodules (malignant vs. benign), but its accuracy relies heavily on examiner experience. Therefore, the aim of this study was to develop a less subjective diagnostic model aided by machine learning. A total of 2064 thyroid nodules (2032 patients, 695 male; mean age 45.25 ± 13.49 years) met all of the following inclusion criteria: (i) hemi- or total thyroidectomy, (ii) maximum nodule diameter 2.5 cm, (iii) examination by conventional US and real-time elastography within one month before surgery, and (iv) no previous thyroid surgery or percutaneous thermotherapy. Models based on nine commonly used algorithms were developed on a randomly selected 60% of samples and validated on the remaining 40% of cases. All models were validated on a data set with a pretest probability of malignancy of 10%. The models were refined with machine learning that consisted of 1000 repetitions of derivation and validation, and were compared to diagnosis by an experienced radiologist. Sensitivity, specificity, accuracy, and area under the curve (AUC) were calculated. A random forest algorithm led to the best diagnostic model, which performed better than radiologist diagnosis based on conventional US only (AUC = 0.924 [confidence interval (CI) 0.895-0.953] vs. 0.834 [CI 0.815-0.853]) and based on both conventional US and real-time elastography (AUC = 0.938 [CI 0.914-0.961] vs. 0.843 [CI 0.829-0.857]). Machine-learning algorithms based on US examinations, particularly the random forest classifier, may diagnose malignant thyroid nodules better than radiologists.
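The evaluation scheme described in the abstract (a random 60%/40% derivation/validation split, repeated 1000 times, scored by sensitivity, specificity, accuracy, and AUC) can be illustrated with a minimal sketch. This is not the study's code. All function names are hypothetical, the data are synthetic, and a simple cutoff chosen by Youden's index stands in for model fitting instead of the random forest or any of the nine algorithms the authors used.

```python
import random

def auc(labels, scores):
    """AUC as the probability that a malignant case (label 1) outscores
    a benign case (label 0); ties count one half."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one case of each class")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def confusion_metrics(labels, scores, threshold):
    """Return (sensitivity, specificity, accuracy) at a given cutoff."""
    tp = fp = tn = fn = 0
    for y, s in zip(labels, scores):
        predicted_malignant = s >= threshold
        if y == 1 and predicted_malignant:
            tp += 1
        elif y == 1:
            fn += 1
        elif predicted_malignant:
            fp += 1
        else:
            tn += 1
    return tp / (tp + fn), tn / (tn + fp), (tp + tn) / len(labels)

def split_60_40(n, rng):
    """Shuffle case indices and split them 60% derivation / 40% validation."""
    indices = list(range(n))
    rng.shuffle(indices)
    cut = (n * 6) // 10
    return indices[:cut], indices[cut:]

def derive_threshold(labels, scores):
    """Stand-in for model fitting (an assumption of this sketch):
    pick the cutoff that maximizes Youden's J = sensitivity + specificity - 1."""
    best_t, best_j = None, -1.0
    for t in sorted(set(scores)):
        sens, spec, _ = confusion_metrics(labels, scores, t)
        if sens + spec - 1 > best_j:
            best_t, best_j = t, sens + spec - 1
    return best_t

def repeated_validation(labels, scores, repetitions=1000, seed=0):
    """Repeat derivation on 60% and validation on 40%; return the mean
    validation AUC and mean validation accuracy."""
    rng = random.Random(seed)
    aucs, accs = [], []
    for _ in range(repetitions):
        train, val = split_60_40(len(labels), rng)
        t = derive_threshold([labels[i] for i in train],
                             [scores[i] for i in train])
        val_y = [labels[i] for i in val]
        val_s = [scores[i] for i in val]
        aucs.append(auc(val_y, val_s))
        accs.append(confusion_metrics(val_y, val_s, t)[2])
    return sum(aucs) / len(aucs), sum(accs) / len(accs)

if __name__ == "__main__":
    # Synthetic 1-5 suspicion scores, purely illustrative (not study data).
    rng = random.Random(42)
    labels = [1] * 60 + [0] * 140
    scores = [rng.choice([3, 4, 4, 5, 5]) if y else rng.choice([1, 1, 2, 2, 3, 4])
              for y in labels]
    mean_auc, mean_acc = repeated_validation(labels, scores)
    print(f"mean AUC = {mean_auc:.3f}, mean accuracy = {mean_acc:.3f}")
```

Averaging over many random splits, as in this sketch, reduces the dependence of the reported metrics on any single partition of the cases. The sketch does not reproduce the 10% pretest probability that the abstract reports for the validation set.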