Wu Huaiyu, Ye Xiuqin, Jiang Yitao, Tian Hongtian, Yang Keen, Cui Chen, Shi Siyuan, Liu Yan, Huang Sijing, Chen Jing, Xu Jinfeng, Dong Fajin
Department of Ultrasound, First Clinical College of Jinan University, Second Clinical College of Jinan University, First Affiliated Hospital of Southern University of Science and Technology, Shenzhen People's Hospital, Shenzhen, China.
Research and Development Department, Microport Prophecy, Shanghai, China.
Front Oncol. 2022 Jul 7;12:869421. doi: 10.3389/fonc.2022.869421. eCollection 2022.
The purpose of this study was to explore the performance of different combinations of deep learning (DL) models (Xception, DenseNet121, MobileNet, ResNet50 and EfficientNetB0) and input image resolutions (REZs) (224 × 224, 320 × 320 and 448 × 448 pixels) for breast cancer diagnosis.
This multicenter study retrospectively analyzed gray-scale breast ultrasound images collected from two Chinese hospitals. The data were divided into training, validation, internal testing and external testing sets. Three hundred images were randomly selected for the physician-AI comparison. The Wilcoxon test was used to compare the diagnostic error of physicians and models at significance levels of α = 0.05 and 0.10. Specificity, sensitivity, accuracy and area under the curve (AUC) were used as the primary evaluation metrics.
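The four evaluation metrics named above can be illustrated with a minimal sketch. It is not the authors' evaluation code: the function names `confusion_metrics` and `auc`, the 0.5 decision threshold, and the six example scores are all assumptions made for illustration. Labels use 1 for malignant and 0 for benign, and AUC is computed with the rank-based (Mann-Whitney) formulation.

```python
from typing import Dict, Sequence


def confusion_metrics(y_true: Sequence[int], y_score: Sequence[float],
                      threshold: float = 0.5) -> Dict[str, float]:
    """Sensitivity, specificity and accuracy at a fixed threshold (assumed 0.5)."""
    tp = fp = tn = fn = 0
    for label, score in zip(y_true, y_score):
        pred = 1 if score >= threshold else 0
        if label == 1 and pred == 1:
            tp += 1
        elif label == 1 and pred == 0:
            fn += 1
        elif label == 0 and pred == 1:
            fp += 1
        else:
            tn += 1
    nan = float("nan")
    total = tp + fp + tn + fn
    return {
        # True-positive rate: malignant cases correctly flagged.
        "sensitivity": tp / (tp + fn) if (tp + fn) else nan,
        # True-negative rate: benign cases correctly cleared.
        "specificity": tn / (tn + fp) if (tn + fp) else nan,
        "accuracy": (tp + tn) / total if total else nan,
    }


def auc(y_true: Sequence[int], y_score: Sequence[float]) -> float:
    """AUC as the probability that a random positive outscores a random negative.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for label, s in zip(y_true, y_score) if label == 1]
    neg = [s for label, s in zip(y_true, y_score) if label == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Made-up scores for six images (not data from the study).
labels = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.6, 0.2, 0.1]
metrics = confusion_metrics(labels, scores)
area = auc(labels, scores)
```

On this toy input, one malignant image falls below the threshold and one benign image rises above it, so sensitivity and specificity are both 2/3, while AUC (8/9) does not depend on the threshold.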
A total of 13,684 images from 3,447 female patients were included. In the external test set, the 224 and 320 REZs achieved the best performance for MobileNet and EfficientNetB0, respectively (AUC: 0.893 and 0.907), whereas the 448 REZ achieved the best performance for Xception, DenseNet121 and ResNet50 (AUC: 0.900, 0.883 and 0.871, respectively). In the physician-AI test set, EfficientNetB0 at the 320 REZ (AUC: 0.896, P < 0.1) outperformed senior physicians. MobileNet at the 224 REZ (AUC: 0.878, P < 0.1) and Xception at the 448 REZ (AUC: 0.895, P < 0.1) outperformed junior physicians, while DenseNet121 (AUC: 0.880, P < 0.05) and ResNet50 (AUC: 0.838, P < 0.05) at the 448 REZ outperformed only entry-level physicians.
Based on gray-scale breast ultrasound images, we identified the best-performing combination of DL model and input resolution, which outperformed the physicians.