I Dermatology Clinic, Seoul, Korea.
Department of Dermatology, Severance Hospital, Yonsei University College of Medicine, Seoul, Korea.
JAMA Dermatol. 2020 Jan 1;156(1):29-37. doi: 10.1001/jamadermatol.2019.3807.
IMPORTANCE: Detection of cutaneous cancer on the face using deep-learning algorithms has been challenging because various anatomic structures create curves and shadows that confuse the algorithm and can lead to false-positive results.
OBJECTIVE: To evaluate whether an algorithm can automatically locate suspected areas and predict the probability of a lesion being malignant.
DESIGN, SETTING, AND PARTICIPANTS: Region-based convolutional neural network technology was used to create 924 538 possible lesions by extracting nodular benign lesions from 182 348 clinical photographs. After manually or automatically annotating these possible lesions based on image findings, convolutional neural networks were trained with 1 106 886 image crops to locate and diagnose cancer. Validation data sets (2844 images from 673 patients; mean [SD] age, 58.2 [19.9] years; 308 men [45.8%]; 185 patients with malignant tumors, 305 with benign tumors, and 183 free of tumor) were obtained from 3 hospitals between January 1, 2010, and September 30, 2018.
MAIN OUTCOMES AND MEASURES: The area under the receiver operating characteristic curve, F1 score (harmonic mean of precision and recall; range, 0.000-1.000), and Youden index score (sensitivity + specificity - 1; 0%-100%) were used to compare the performance of the algorithm with that of the participants.
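The two threshold-based measures defined above can be computed from a confusion matrix. The following is a minimal sketch, not the study's evaluation code; the function name `screening_metrics` and the example counts are hypothetical and are not data from the study.

```python
def screening_metrics(tp, fp, tn, fn):
    """Compute F1 score and Youden index from confusion-matrix counts.

    tp, fp, tn, fn are counts of true positives, false positives,
    true negatives, and false negatives (illustrative inputs only).
    """
    # Precision: fraction of positive calls that are truly malignant.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    # Recall (sensitivity): fraction of malignant cases that are detected.
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    # Specificity: fraction of nonmalignant cases correctly cleared.
    specificity = tn / (tn + fp) if (tn + fp) else 0.0
    # F1 is the harmonic mean of precision and recall.
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    # Youden index: sensitivity + specificity - 1.
    youden = recall + specificity - 1
    return {
        "precision": precision,
        "recall": recall,
        "specificity": specificity,
        "f1": f1,
        "youden": youden,
    }


# Hypothetical counts, chosen only to make the arithmetic easy to follow.
example = screening_metrics(tp=40, fp=10, tn=90, fn=10)
```

With these hypothetical counts, precision and recall are both 0.8 and specificity is 0.9, so the F1 score is 0.8 and the Youden index is 0.7.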
RESULTS: The algorithm analyzed a mean (SD) of 4.2 (2.4) photographs per patient and reported each patient's malignancy score as the highest malignancy output among those photographs. The area under the receiver operating characteristic curve for the validation data set (673 patients) was 0.910. At a high-sensitivity cutoff threshold, the sensitivity and specificity of the model for the 673 patients were 76.8% and 90.6%, respectively. With the test partition (325 images; 80 patients), the performance of the algorithm was compared with that of 13 board-certified dermatologists, 34 dermatology residents, 20 nondermatologic physicians, and 52 members of the general public with no medical background. When disease screening performance was evaluated in the high-sensitivity range using the F1 score and Youden index score, the algorithm achieved a higher F1 score (0.831 vs mean [SD] 0.653 [0.126]; P < .001) and a higher Youden index score (0.675 vs mean [SD] 0.417 [0.124]; P < .001) than the nondermatologic physicians. The performance of the algorithm was comparable with that of the dermatologists (F1 score, 0.831 vs mean [SD] 0.835 [0.040]; Youden index score, 0.675 vs mean [SD] 0.671 [0.100]).
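The patient-level scoring rule described above (taking the highest malignancy output across a patient's photographs) can be sketched as follows. This is an illustrative sketch under stated assumptions, not the authors' implementation: the function names, the patient identifiers, the per-photograph scores, and the 0.5 cutoff are all hypothetical, and the abstract does not report the cutoff value that was used.

```python
def patient_scores(photo_outputs):
    """Reduce per-photograph malignancy outputs to one score per patient.

    photo_outputs: iterable of (patient_id, malignancy_score) pairs,
    one pair per photograph. Returns {patient_id: highest score}.
    """
    best = {}
    for patient_id, score in photo_outputs:
        # Keep the highest malignancy output seen for this patient.
        if patient_id not in best or score > best[patient_id]:
            best[patient_id] = score
    return best


def flag_patients(scores, threshold):
    """Mark a patient as suspicious when the score reaches the cutoff."""
    return {pid: score >= threshold for pid, score in scores.items()}


# Hypothetical outputs for two patients with two photographs each.
outputs = [("A", 0.12), ("A", 0.81), ("B", 0.05), ("B", 0.30)]
scores = patient_scores(outputs)
flags = flag_patients(scores, threshold=0.5)  # 0.5 is an assumed cutoff
```

Taking the maximum means that a single high-scoring photograph is enough to flag a patient, which favors sensitivity over specificity at the patient level.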
CONCLUSIONS AND RELEVANCE: The results of the study suggest that the algorithm could localize and diagnose skin cancer without preselection of suspicious lesions by dermatologists.