From the Department of Obstetrics and Gynecology (H.C., B.W.Y., L.Q., X.H., M.J.J., Q.W.D., W.W.F.) and Department of Pathology (F.Y.), Ruijin Hospital, Shanghai Jiaotong University School of Medicine, 197 Ruijin 2nd Road, Huangpu District, Shanghai 200025, China; and Philips Research Asia Shanghai, Shanghai, China (Y.S.M., X.H.B., X.W.H.).
Radiology. 2022 Jul;304(1):106-113. doi: 10.1148/radiol.211367. Epub 2022 Apr 12.
Background Deep learning (DL) algorithms could improve the classification of ovarian tumors assessed with multimodal US.

Purpose To develop DL algorithms for the automated classification of benign versus malignant ovarian tumors assessed with US and to compare algorithm performance with Ovarian-Adnexal Reporting and Data System (O-RADS) risk categorization and subjective expert assessment for malignancy.

Materials and Methods This retrospective study included consecutive women with ovarian tumors who underwent gray-scale and color Doppler US from January 2019 to November 2019. Histopathologic analysis was the reference standard. The data set was divided into training (70%), validation (10%), and test (20%) sets. Algorithms modified from the residual network (ResNet) with two fusion strategies (feature fusion [hereafter, DLfeature] or decision fusion [hereafter, DLdecision]) were developed. DL prediction of malignancy was compared with O-RADS risk categorization and expert assessment by area under the receiver operating characteristic curve (AUC) analysis in the test set.

Results A total of 422 women (mean age, 46.4 years ± 14.8 [SD]) with 304 benign and 118 malignant tumors were included; there were 337 women in the combined training and validation data set and 85 women in the test data set. DLfeature had an AUC of 0.93 (95% CI: 0.85, 0.97) for classifying malignant from benign ovarian tumors, comparable with O-RADS (AUC, 0.92; 95% CI: 0.85, 0.97; P = .88) and expert assessment (AUC, 0.97; 95% CI: 0.91, 0.99; P = .07), and similar to DLdecision (AUC, 0.90; 95% CI: 0.82, 0.96; P = .29). DLfeature, DLdecision, O-RADS, and expert assessment achieved sensitivities of 92%, 92%, 92%, and 96%, respectively, and specificities of 80%, 85%, 89%, and 87%, respectively, for malignancy.

Conclusion Deep learning algorithms developed by using multimodal US images may distinguish malignant from benign ovarian tumors with diagnostic performance comparable to that of O-RADS risk categorization and subjective expert assessment. © RSNA, 2022.
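The two fusion strategies named in the abstract can be illustrated with a minimal sketch. The toy feature extractors, feature dimensions, and linear classifier head below are illustrative assumptions only; the study's actual models are modified ResNet networks, whose details are not given here.

```python
from math import exp

def grayscale_features(image):
    # Stand-in for a CNN backbone (the study modifies ResNet) applied to the
    # gray-scale US image; returns a tiny feature vector for illustration.
    return [sum(image) / len(image), max(image)]

def doppler_features(image):
    # Stand-in for a second backbone applied to the color Doppler image.
    return [sum(image) / len(image), min(image)]

def classify(features, weights, bias):
    # Toy linear head with sigmoid output, standing in for the final layers.
    z = sum(w * f for w, f in zip(weights, features)) + bias
    return 1.0 / (1.0 + exp(-z))

def feature_fusion(gray, doppler, weights, bias):
    # Feature fusion: concatenate per-modality features, then classify once.
    fused = grayscale_features(gray) + doppler_features(doppler)
    return classify(fused, weights, bias)

def decision_fusion(gray, doppler, w_gray, b_gray, w_dop, b_dop):
    # Decision fusion: score each modality separately, then combine the
    # decisions (here, a simple average of the two malignancy probabilities).
    p_gray = classify(grayscale_features(gray), w_gray, b_gray)
    p_dop = classify(doppler_features(doppler), w_dop, b_dop)
    return 0.5 * (p_gray + p_dop)
```

The design difference is where the modalities are combined: feature fusion merges information before a single classification, whereas decision fusion trains (or applies) one classifier per modality and merges their output probabilities.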
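The diagnostic metrics reported in the abstract (AUC, sensitivity, specificity) can be computed from scored predictions as sketched below; the labels, scores, and threshold are illustrative toy values, not the study's data.

```python
def auc(labels, scores):
    # Area under the ROC curve via the rank (Mann-Whitney U) formulation:
    # the probability that a randomly chosen positive case scores higher
    # than a randomly chosen negative case, counting ties as 1/2.
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def sensitivity_specificity(labels, scores, threshold=0.5):
    # Sensitivity = TP / (TP + FN); specificity = TN / (TN + FP),
    # after binarizing scores at the given threshold.
    preds = [int(s >= threshold) for s in scores]
    tp = sum(1 for y, p in zip(labels, preds) if y == 1 and p == 1)
    fn = sum(1 for y, p in zip(labels, preds) if y == 1 and p == 0)
    tn = sum(1 for y, p in zip(labels, preds) if y == 0 and p == 0)
    fp = sum(1 for y, p in zip(labels, preds) if y == 0 and p == 1)
    return tp / (tp + fn), tn / (tn + fp)

# Illustrative example (1 = malignant, 0 = benign; not the study's data):
labels = [1, 1, 1, 0, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.6, 0.3, 0.2, 0.1]
```

Note that AUC is threshold-free, while the reported sensitivity/specificity pairs depend on the operating point chosen for each method (for O-RADS and expert assessment, a categorical cutoff rather than a score threshold).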