Dermatology Service, Department of Medicine, Memorial Sloan Kettering Cancer Center, New York, New York.
IBM Research Division, Thomas J. Watson Research Center, Yorktown Heights, New York.
J Am Acad Dermatol. 2018 Feb;78(2):270-277.e1. doi: 10.1016/j.jaad.2017.08.016. Epub 2017 Sep 29.
BACKGROUND: Computer vision may aid in melanoma detection.
OBJECTIVE: We sought to compare the melanoma diagnostic accuracy of computer algorithms with that of dermatologists using dermoscopic images.
METHODS: We conducted a cross-sectional study using 100 randomly selected dermoscopic images (50 melanomas, 44 nevi, and 6 lentigines) from an international computer vision melanoma challenge dataset (n = 379), along with individual algorithm results from 25 teams. We used 5 methods (nonlearned and machine learning) to combine individual automated predictions into "fusion" algorithms. In a companion study, 8 dermatologists classified the lesions in the 100 images as either benign or malignant.
RESULTS: The average sensitivity and specificity of dermatologists in classification were 82% and 59%, respectively. At 82% sensitivity, dermatologist specificity was similar to that of the top challenge algorithm (59% vs. 62%, P = .68) but lower than that of the best-performing fusion algorithm (59% vs. 76%, P = .02). The receiver operating characteristic area of the top fusion algorithm was greater than the mean receiver operating characteristic area of dermatologists (0.86 vs. 0.71, P = .001).
LIMITATIONS: The dataset lacked the full spectrum of skin lesions encountered in clinical practice, particularly banal lesions. Readers and algorithms were not provided with clinical data (eg, age or lesion history/symptoms). Results obtained using our study design cannot be extrapolated to clinical practice.
CONCLUSION: Deep learning computer vision systems classified melanoma dermoscopy images with accuracy that exceeded some but not all dermatologists.
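The comparison described in the abstract rests on three quantities: a fused score per image, specificity at a fixed sensitivity, and the receiver operating characteristic area. The following is a minimal Python sketch of how these could be computed, not the study's implementation. The abstract does not name its 5 fusion methods, so simple score averaging is used here as one plausible nonlearned method; all function names and the toy scores are hypothetical and are not study data.

```python
from typing import List, Sequence, Tuple


def average_fusion(scores_by_algorithm: Sequence[Sequence[float]]) -> List[float]:
    """Assumed nonlearned fusion: mean of each image's scores across algorithms."""
    n_algorithms = len(scores_by_algorithm)
    n_images = len(scores_by_algorithm[0])
    return [
        sum(alg[i] for alg in scores_by_algorithm) / n_algorithms
        for i in range(n_images)
    ]


def sensitivity_specificity(
    labels: Sequence[int], scores: Sequence[float], threshold: float
) -> Tuple[float, float]:
    """Call an image malignant when score >= threshold (1 = melanoma, 0 = benign)."""
    tp = sum(1 for y, s in zip(labels, scores) if y == 1 and s >= threshold)
    tn = sum(1 for y, s in zip(labels, scores) if y == 0 and s < threshold)
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    return tp / n_pos, tn / n_neg


def specificity_at_sensitivity(
    labels: Sequence[int], scores: Sequence[float], target_sensitivity: float
) -> float:
    """Best specificity over all thresholds whose sensitivity meets the target."""
    best = 0.0
    for threshold in sorted(set(scores)):
        sens, spec = sensitivity_specificity(labels, scores, threshold)
        if sens >= target_sensitivity:
            best = max(best, spec)
    return best


def roc_auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """ROC area as the chance a melanoma outscores a benign lesion (ties count 0.5)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Synthetic toy data for illustration only (not from the study).
toy_labels = [1, 1, 1, 1, 0, 0, 0, 0]
toy_algorithm_a = [0.9, 0.8, 0.4, 0.7, 0.3, 0.6, 0.2, 0.1]
toy_algorithm_b = [0.7, 0.6, 0.8, 0.3, 0.4, 0.2, 0.5, 0.1]
toy_fused = average_fusion([toy_algorithm_a, toy_algorithm_b])

print("fused ROC area:", roc_auc(toy_labels, toy_fused))
print(
    "fused specificity at 75% sensitivity:",
    specificity_at_sensitivity(toy_labels, toy_fused, 0.75),
)
```

To mirror the abstract's comparison, `target_sensitivity` would be set to the dermatologists' average sensitivity (0.82), so that algorithm and reader specificity are compared at the same operating point.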