I Dermatology Clinic, Seoul, Korea.
Department of Radiology, Chonnam National University Medical School and Hospital, Gwangju, Korea.
J Invest Dermatol. 2020 Sep;140(9):1753-1761. doi: 10.1016/j.jid.2020.01.019. Epub 2020 Mar 31.
Although deep learning algorithms have demonstrated expert-level performance, previous efforts were mostly limited to binary classification of a small number of disorders. We trained an algorithm with 220,680 images of 174 disorders and validated it using the Edinburgh (1,300 images; 10 disorders) and SNU (2,201 images; 134 disorders) datasets. The algorithm could accurately predict malignancy, suggest primary treatment options, perform multi-class classification among 134 disorders, and improve the performance of medical professionals. The areas under the curve (AUCs) for malignancy detection were 0.928 ± 0.002 (Edinburgh) and 0.937 ± 0.004 (SNU). The AUCs for primary treatment suggestion (SNU) were 0.828 ± 0.012, 0.885 ± 0.006, 0.885 ± 0.006, and 0.918 ± 0.006 for steroids, antibiotics, antivirals, and antifungals, respectively. For multi-class classification, the mean top-1 and top-5 accuracies were 56.7 ± 1.6% and 92.0 ± 1.1% (Edinburgh) and 44.8 ± 1.2% and 78.1 ± 0.3% (SNU), respectively. With the assistance of our algorithm, the sensitivity and specificity of 47 clinicians (21 dermatologists and 26 dermatology residents) for malignancy prediction (SNU; 240 images) improved by 12.1% (P < 0.0001) and 1.1% (P < 0.0001), respectively. The malignancy prediction sensitivity of 23 non-medical professionals increased significantly, by 83.8% (P < 0.0001). The top-1 and top-3 accuracies of four doctors in the multi-class classification of 134 diseases (SNU; 2,201 images) increased by 7.0% (P = 0.045) and 10.1% (P = 0.0020), respectively. The results suggest that our algorithm may serve as augmented intelligence that can empower medical professionals in diagnostic dermatology.
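For readers unfamiliar with the metrics reported above, the following is a minimal sketch of how top-k accuracy, sensitivity, and specificity are conventionally computed. It is not the authors' evaluation code. The function names `top_k_accuracy` and `sensitivity_specificity` are hypothetical, and the toy scores and labels are invented for illustration and do not come from the study.

```python
from typing import Sequence, Tuple


def top_k_accuracy(scores: Sequence[Sequence[float]],
                   labels: Sequence[int], k: int) -> float:
    """Fraction of samples whose true class is among the k highest-scoring classes."""
    hits = 0
    for row, label in zip(scores, labels):
        # Rank class indices from highest to lowest score.
        ranked = sorted(range(len(row)), key=lambda c: row[c], reverse=True)
        if label in ranked[:k]:
            hits += 1
    return hits / len(labels)


def sensitivity_specificity(predicted: Sequence[bool],
                            actual: Sequence[bool]) -> Tuple[float, float]:
    """Sensitivity = TP / (TP + FN); specificity = TN / (TN + FP)."""
    tp = sum(1 for p, a in zip(predicted, actual) if p and a)
    fn = sum(1 for p, a in zip(predicted, actual) if not p and a)
    tn = sum(1 for p, a in zip(predicted, actual) if not p and not a)
    fp = sum(1 for p, a in zip(predicted, actual) if p and not a)
    return tp / (tp + fn), tn / (tn + fp)


# Toy data (invented): 4 images, 3 candidate diagnoses each.
toy_scores = [
    [0.7, 0.2, 0.1],
    [0.1, 0.5, 0.4],
    [0.2, 0.3, 0.5],
    [0.6, 0.3, 0.1],
]
toy_labels = [0, 2, 2, 2]

# Toy malignancy calls (invented): True means "malignant".
toy_predicted = [True, True, False, False, True]
toy_actual = [True, False, False, True, True]

if __name__ == "__main__":
    print("top-1:", top_k_accuracy(toy_scores, toy_labels, 1))
    print("top-2:", top_k_accuracy(toy_scores, toy_labels, 2))
    print("sens, spec:", sensitivity_specificity(toy_predicted, toy_actual))
```

Top-k accuracy counts a prediction as correct when the true diagnosis appears anywhere in the k highest-ranked outputs, which is why the top-5 figures in the abstract are considerably higher than the top-1 figures.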