A benchmark for neural network robustness in skin cancer classification.

Affiliations

Digital Biomarkers for Oncology Group, National Center for Tumor Diseases (NCT), German Cancer Research Center (DKFZ), Heidelberg, Germany.

Department of Dermatology and Allergy, University Hospital, LMU Munich, Munich, Germany.

Publication information

Eur J Cancer. 2021 Sep;155:191-199. doi: 10.1016/j.ejca.2021.06.047. Epub 2021 Aug 11.

Abstract

BACKGROUND

One prominent application of deep learning-based classifiers is skin cancer classification on dermoscopic images. However, classifier evaluation is often limited to holdout data, which can mask common shortcomings such as susceptibility to confounding factors. To increase clinical applicability, it is necessary to thoroughly evaluate such classifiers on out-of-distribution (OOD) data.

OBJECTIVE

The objective of the study was to establish a dermoscopic skin cancer benchmark in which classifier robustness to OOD data can be measured.

METHODS

Using a proprietary dermoscopic image database and a set of image transformations, we create an OOD robustness benchmark and evaluate the robustness of four different convolutional neural network (CNN) architectures on it.
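
To make this kind of robustness check concrete, the sketch below compares a CNN's accuracy on clean versus artificially corrupted copies of a dermoscopic image folder. It is a minimal example under assumptions: a PyTorch/torchvision setup (torchvision >= 0.13 for the weights API), an ImageNet-initialised ResNet-50 with a binary head, a hypothetical local image path, and two simple corruptions that only stand in for the benchmark's actual SAM-C transformations and the four architectures evaluated in the study.

```python
# Minimal sketch, not the authors' pipeline: compare a CNN's accuracy on
# clean vs. artificially corrupted dermoscopic images. The corruption set,
# the ImageNet-initialised ResNet-50 and the image path are assumptions.
import torch
from torch.utils.data import DataLoader
from torchvision import datasets, models, transforms

device = "cuda" if torch.cuda.is_available() else "cpu"

# In practice a classifier trained on dermoscopic data would be loaded here.
model = models.resnet50(weights=models.ResNet50_Weights.IMAGENET1K_V1)
model.fc = torch.nn.Linear(model.fc.in_features, 2)  # melanoma / nevus head
model.eval().to(device)

base = [transforms.Resize(256), transforms.CenterCrop(224)]
variants = {
    "clean": transforms.Compose(base + [transforms.ToTensor()]),
    "blur": transforms.Compose(base + [transforms.GaussianBlur(9, sigma=3.0),
                                       transforms.ToTensor()]),
    "low_contrast": transforms.Compose(base + [transforms.ColorJitter(contrast=(0.3, 0.3)),
                                               transforms.ToTensor()]),
}

@torch.no_grad()
def accuracy(image_dir: str, tf) -> float:
    # ImageFolder expects one subfolder per class, e.g. melanoma/ and nevus/.
    ds = datasets.ImageFolder(image_dir, transform=tf)
    loader = DataLoader(ds, batch_size=32)
    correct = total = 0
    for x, y in loader:
        pred = model(x.to(device)).argmax(dim=1).cpu()
        correct += (pred == y).sum().item()
        total += y.numel()
    return correct / total

for name, tf in variants.items():
    print(name, accuracy("path/to/dermoscopy_images", tf))  # hypothetical path
```

Comparing the per-variant accuracies against the clean baseline gives a simple, if coarse, measure of how much a classifier degrades on low-quality inputs.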

RESULTS

The benchmark contains three data sets, Skin Archive Munich (SAM), SAM-corrupted (SAM-C) and SAM-perturbed (SAM-P), and is publicly available for download. To maintain the benchmark's OOD status, ground truth labels are not provided and test results should be sent to us for assessment. The SAM data set contains 319 unmodified and biopsy-verified dermoscopic melanoma (n = 194) and nevus (n = 125) images. SAM-C and SAM-P contain images from SAM which were artificially modified to test a classifier against low-quality inputs and to measure its prediction stability over small image changes, respectively. All four CNNs showed susceptibility to corruptions and perturbations.
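
The prediction-stability idea behind SAM-P can be illustrated with a short flip-rate computation: run the same lesion image through a sequence of gradually stronger but visually small perturbations and count how often the predicted class changes. The sketch below is an illustrative assumption rather than the published SAM-P protocol; small rotations stand in for the benchmark's perturbation sequences, and flip_rate is a hypothetical helper that takes any trained binary classifier.

```python
# Minimal sketch, an assumption rather than the published SAM-P protocol:
# prediction stability as the fraction of consecutive steps in a perturbation
# sequence on which the predicted class flips. Small rotations stand in for
# the benchmark's perturbation sequences; flip_rate is a hypothetical helper.
import torch
from PIL import Image
from torchvision import transforms

@torch.no_grad()
def flip_rate(model, image_path: str, device: str = "cpu") -> float:
    img = Image.open(image_path).convert("RGB")
    prep = transforms.Compose([transforms.Resize(256),
                               transforms.CenterCrop(224),
                               transforms.ToTensor()])
    preds = []
    for angle in range(0, 16):  # gradually stronger, visually small rotations
        x = prep(img.rotate(angle)).unsqueeze(0).to(device)
        preds.append(model(x).argmax(dim=1).item())
    flips = sum(p != q for p, q in zip(preds, preds[1:]))
    return flips / (len(preds) - 1)  # 0.0 = fully stable predictions
```

A perfectly robust classifier would keep the same prediction across the whole sequence; frequent flips indicate sensitivity to changes that should not alter the diagnosis.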

CONCLUSIONS

This benchmark provides three data sets which allow for OOD testing of binary skin cancer classifiers. Our classifier performance confirms the shortcomings of CNNs and provides a frame of reference. Altogether, this benchmark should facilitate a more thorough evaluation process and thereby enable the development of more robust skin cancer classifiers.
