Association between different scale bars in dermoscopic images and diagnostic performance of a market-approved deep learning convolutional neural network for melanoma recognition.

Affiliations

Department of Dermatology, University of Heidelberg, Heidelberg, Germany.

Department of Research and Development, FotoFinder Systems GmbH, Bad Birnbach, Germany.

Publication information

Eur J Cancer. 2021 Mar;145:146-154. doi: 10.1016/j.ejca.2020.12.010. Epub 2021 Jan 16.

Abstract

BACKGROUND

Studies that systematically investigate possible causes of false diagnoses by deep learning convolutional neural networks (CNNs) are scarce, yet they are needed before broader application.

OBJECTIVES

The objective of the study was to investigate whether scale bars in dermoscopic images are associated with the diagnostic accuracy of a market-approved CNN.

METHODS

This cross-sectional analysis applied a CNN trained with more than 150,000 images (Moleanalyzer-pro®, FotoFinder Systems Inc., Bad Birnbach, Germany) to seven dermoscopic image sets depicting the same 130 melanocytic lesions (107 nevi, 23 melanomas), either without or with digitally superimposed scale bars from different manufacturers. Sensitivity, specificity and the area under the curve (AUC) of the receiver operating characteristic (ROC) were assessed for the CNN's binary classification of images with and without superimposed scale bars.
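The three metrics named above can be illustrated with a minimal sketch. It assumes the classifier returns a malignancy score per image and that a fixed cut-off turns the score into a binary call. The labels, scores, threshold and function names below are hypothetical toy values chosen for illustration; they are not study data and not the vendor's API.

```python
from typing import Sequence, Tuple


def sensitivity_specificity(
    labels: Sequence[int], scores: Sequence[float], threshold: float
) -> Tuple[float, float]:
    """Return (sensitivity, specificity) for a binary call at `threshold`.

    labels: 1 = melanoma, 0 = nevus. A score >= threshold counts as a
    'melanoma' call.
    """
    tp = sum(1 for y, s in zip(labels, scores) if y == 1 and s >= threshold)
    fn = sum(1 for y, s in zip(labels, scores) if y == 1 and s < threshold)
    tn = sum(1 for y, s in zip(labels, scores) if y == 0 and s < threshold)
    fp = sum(1 for y, s in zip(labels, scores) if y == 0 and s >= threshold)
    return tp / (tp + fn), tn / (tn + fp)


def roc_auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """ROC AUC as the probability that a random melanoma scores higher
    than a random nevus (ties count as one half)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg
    )
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: 3 melanomas and 5 nevi (NOT the study's images).
labels = [1, 1, 1, 0, 0, 0, 0, 0]
scores = [0.92, 0.75, 0.40, 0.55, 0.30, 0.20, 0.10, 0.05]

sens, spec = sensitivity_specificity(labels, scores, threshold=0.5)
auc = roc_auc(labels, scores)
print(f"sensitivity={sens:.3f} specificity={spec:.3f} AUC={auc:.3f}")
```

Running the same functions once per image set (control plus each scale-bar variant) would give the per-set figures that the comparison relies on. Because every set depicts the same 130 lesions, any difference between sets can be attributed to the overlay rather than to the lesions.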

RESULTS

Six dermoscopic image sets with different scale bars and one control set without scale bars (910 images in total) were submitted to CNN analysis. In images without scale bars, the CNN attained a sensitivity [95% confidence interval] of 87.0% [67.9%-95.5%] and a specificity of 87.9% [80.3%-92.8%]. ROC AUC was 0.953 [0.914-0.992]. Scale bars were not associated with significant changes in sensitivity (range 87.0%-95.7%, all p = 1.0). However, four scale bars reduced the CNN's specificity (range 0%-43.9%, all p < 0.001). Moreover, two scale bars significantly reduced the ROC AUC (range 0.520-0.848, both p ≤ 0.042).
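The control-set confidence intervals can be checked with a short calculation. The abstract gives neither the raw counts nor the interval method, so both are assumptions here: 20 of 23 melanomas and 94 of 107 nevi correctly classified are the counts implied by the reported percentages, and the Wilson score interval is used as one common choice for binomial proportions.

```python
import math
from typing import Tuple


def wilson_interval(
    successes: int, total: int, z: float = 1.959964
) -> Tuple[float, float]:
    """Two-sided Wilson score interval for a binomial proportion.

    z = 1.959964 corresponds to a 95% confidence level.
    """
    p_hat = successes / total
    z2 = z * z
    denom = 1 + z2 / total
    center = (p_hat + z2 / (2 * total)) / denom
    half_width = (
        z * math.sqrt(p_hat * (1 - p_hat) / total + z2 / (4 * total * total))
    ) / denom
    return center - half_width, center + half_width


# Assumed counts, inferred from the reported percentages (not stated in the abstract):
# 20/23 melanomas detected (87.0%), 94/107 nevi correctly called benign (87.9%).
sens_lo, sens_hi = wilson_interval(20, 23)
spec_lo, spec_hi = wilson_interval(94, 107)

# Sanity check on the image count: 7 sets (6 scale bars + 1 control) x 130 lesions.
total_images = 7 * 130

print(f"sensitivity 95% CI: {sens_lo:.1%} to {sens_hi:.1%}")
print(f"specificity 95% CI: {spec_lo:.1%} to {spec_hi:.1%}")
print(f"total images: {total_images}")
```

Under these assumptions the intervals agree with the reported 67.9%-95.5% and 80.3%-92.8% to one decimal place, which suggests, but does not prove, that a Wilson-type interval was used.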

CONCLUSIONS

Superimposed scale bars in dermoscopic images may impair the CNN's diagnostic accuracy, mostly by increasing the rate of false-positive diagnoses. We recommend avoiding scale bars in images intended for CNN analysis unless specific measures that counteract their effects are implemented.

CLINICAL TRIAL NUMBER

This study was registered at the German Clinical Trial Register (DRKS-Study-ID: DRKS00013570; URL: https://www.drks.de/drks_web/).
