A convolutional neural network trained with dermoscopic images performed on par with 145 dermatologists in a clinical melanoma image classification task.

Author information

National Center for Tumor Diseases (NCT), German Cancer Research Center (DKFZ), Im Neuenheimer Feld 460, 69120 Heidelberg, Germany; Department of Dermatology, University Hospital Heidelberg, Heidelberg, Germany.

National Center for Tumor Diseases (NCT), German Cancer Research Center (DKFZ), Im Neuenheimer Feld 460, 69120 Heidelberg, Germany.

Publication information

Eur J Cancer. 2019 Apr;111:148-154. doi: 10.1016/j.ejca.2019.02.005. Epub 2019 Mar 8.

Abstract

BACKGROUND

Recent studies have demonstrated the use of convolutional neural networks (CNNs) to classify images of melanoma with accuracies comparable to those achieved by board-certified dermatologists. However, the performance of a CNN trained exclusively with dermoscopic images on a clinical image classification task, in direct competition with a large number of dermatologists, has not been measured to date. This study compares the performance of a CNN trained exclusively with dermoscopic images at identifying melanoma in clinical photographs with the manual grading of the same images by dermatologists.

METHODS

We compared automatic digital melanoma classification with the performance of 145 dermatologists from 12 German university hospitals. We used methods from enhanced deep learning to train a CNN with 12,378 open-source dermoscopic images. We used 100 clinical images to compare the performance of the CNN with that of the dermatologists. Dermatologists were compared with the deep neural network in terms of sensitivity, specificity and receiver operating characteristics.
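The two threshold-dependent metrics named above can be sketched as follows. This is a minimal illustration, not the authors' evaluation code: the function name `sensitivity_specificity`, the 0/1 label encoding (1 = melanoma, 0 = benign) and the example counts are all assumptions made for this sketch.

```python
def sensitivity_specificity(y_true, y_pred):
    """Return (sensitivity, specificity) for binary labels.

    Assumed encoding (hypothetical): 1 = melanoma, 0 = benign.
    """
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one positive and one negative case")
    sensitivity = tp / (tp + fn)  # share of melanomas that were flagged
    specificity = tn / (tn + fp)  # share of benign lesions that were cleared
    return sensitivity, specificity


# Made-up reader: 20 melanomas (18 flagged) and 80 benign lesions (52 cleared).
y_true = [1] * 20 + [0] * 80
y_pred = [1] * 18 + [0] * 2 + [0] * 52 + [1] * 28
sens, spec = sensitivity_specificity(y_true, y_pred)
print(f"sensitivity={sens:.3f}, specificity={spec:.3f}")
```

Each dermatologist's gradings would yield one such pair, and averaging the pairs across readers gives mean values of the kind reported in the findings.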

FINDINGS

The mean sensitivity and specificity achieved by the dermatologists with clinical images were 89.4% (range: 55.0%-100%) and 64.4% (range: 22.5%-92.5%), respectively. At the same sensitivity, the CNN exhibited a mean specificity of 68.2% (range: 47.5%-86.25%). Among the dermatologists, the attendings showed the highest mean sensitivity of 92.8% at a mean specificity of 57.7%. At the same high sensitivity of 92.8%, the CNN had a mean specificity of 61.1%.
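Comparing a classifier with human readers "at the same sensitivity" means moving the classifier's decision threshold along its ROC curve until its sensitivity reaches the readers' value, then reading off the specificity at that threshold. The sketch below shows one way this could be done. It is an assumption about the general technique, not the procedure used in the study; the function name `specificity_at_sensitivity` and the toy scores are hypothetical.

```python
def specificity_at_sensitivity(y_true, scores, target_sensitivity):
    """Pick the strictest threshold whose sensitivity reaches the target.

    y_true: 1 = melanoma, 0 = benign (assumed encoding).
    scores: classifier output, higher means more likely melanoma.
    A case is called positive when score >= threshold.
    Returns (threshold, sensitivity, specificity).
    """
    if not 0.0 < target_sensitivity <= 1.0:
        raise ValueError("target_sensitivity must be in (0, 1]")
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    # Sensitivity only changes at positive-case scores, so those are the
    # only thresholds worth trying; go from strictest to most lenient.
    for thr in sorted(set(pos), reverse=True):
        sens = sum(s >= thr for s in pos) / len(pos)
        if sens >= target_sensitivity:
            spec = sum(s < thr for s in neg) / len(neg)
            return thr, sens, spec
    # Not reachable: the lowest positive score always gives sensitivity 1.0.
    raise RuntimeError("no threshold reached the target sensitivity")


# Toy data (made up): four melanomas followed by four benign lesions.
y_true = [1, 1, 1, 1, 0, 0, 0, 0]
scores = [0.9, 0.8, 0.6, 0.3, 0.7, 0.4, 0.2, 0.1]
thr, sens, spec = specificity_at_sensitivity(y_true, scores, 0.75)
print(f"threshold={thr}, sensitivity={sens:.2f}, specificity={spec:.2f}")
```

Because the matched sensitivity on a finite test set can only take discrete values, the sketch takes the first threshold that meets or exceeds the target rather than interpolating between ROC points.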

INTERPRETATION

For the first time, dermatologist-level image classification was achieved on a clinical image classification task without training on clinical images. The CNN showed a smaller variance in its results, indicating a higher robustness of computer vision compared with human assessment for dermatologic image classification tasks.
