Deng Jiawen, Guo Eddie, Zhao Heather Jianbo, Venugopal Kaden, Moskalyk Myron
Temerty Faculty of Medicine, University of Toronto, Toronto, ON, Canada.
Cumming School of Medicine, University of Calgary, Calgary, AB, Canada.
Cancer Inform. 2025 Jun 24;24:11769351251349891. doi: 10.1177/11769351251349891. eCollection 2025.
Early detection of skin cancer in primary care settings is crucial for prognosis, yet clinicians often lack relevant training. Machine learning (ML) methods may offer a solution to this dilemma. This study aimed to develop a neural network for binary classification of skin lesions as malignant or benign from smartphone images and clinical data, using a multimodal, transfer learning-based approach.
We used the PAD-UFES-20 dataset, which comprises 2298 lesion images paired with clinical data. Three neural network models were developed: (1) a clinical data-based network, (2) an image-based network using a pre-trained DenseNet-121 and (3) a multimodal network combining clinical and image data. Models were tuned using Bayesian Optimisation HyperBand (BOHB) under 5-fold cross-validation. Model performance was evaluated using AUC-ROC, average precision, Brier score, calibration curve metrics, Matthews correlation coefficient (MCC), sensitivity and specificity. Model explainability was explored using permutation importance and Grad-CAM.
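The multimodal design described above can be read as a late-fusion architecture: an image branch (DenseNet-121) and a clinical branch each produce an embedding, which are concatenated and passed through a small classifier head. A minimal NumPy sketch of that fusion step is shown below; the embedding sizes, hidden width and randomly initialised weights are illustrative assumptions, not values from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

# Assumed dimensions (not from the paper): DenseNet-121's penultimate
# layer yields a 1024-dim image embedding; the clinical branch encodes
# tabular features (age, bleeding, elevation, growth, ...) into 32 dims.
IMG_DIM, CLIN_DIM, HIDDEN = 1024, 32, 64

def relu(x):
    return np.maximum(x, 0.0)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

# Randomly initialised weights stand in for trained parameters.
W_fuse = rng.normal(0.0, 0.01, (IMG_DIM + CLIN_DIM, HIDDEN))
w_out = rng.normal(0.0, 0.01, HIDDEN)

def multimodal_forward(img_embedding, clin_embedding):
    """Late fusion: concatenate branch embeddings, then classify."""
    fused = np.concatenate([img_embedding, clin_embedding])
    hidden = relu(fused @ W_fuse)
    return sigmoid(float(hidden @ w_out))  # P(malignant)

p = multimodal_forward(rng.normal(size=IMG_DIM), rng.normal(size=CLIN_DIM))
```

In the real model each branch is trained (the image branch via transfer learning from pre-trained DenseNet-121 weights); this sketch only shows how the two modalities are combined at the fusion layer.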
During cross-validation, the multimodal network achieved an AUC-ROC of 0.91 (95% confidence interval [CI] 0.88-0.93) and a Brier score of 0.15 (95% CI 0.11-0.19). During internal validation, it retained an AUC-ROC of 0.91 and a Brier score of 0.12. The multimodal network outperformed the unimodal models on threshold-independent metrics and at the MCC-optimised threshold, but its classification performance was similar to that of the image-only model at high-sensitivity thresholds. Permutation importance analysis showed that key clinical features influencing the clinical data-based network included bleeding, lesion elevation, patient age and recent lesion growth. Grad-CAM visualisations showed that the image-based network focused on lesioned regions during classification rather than background artefacts.
A transfer learning-based, multimodal neural network can accurately identify malignant skin lesions from smartphone images and clinical data. External validation with larger, more diverse datasets is needed to assess the model's generalisability and support clinical adoption.