Zhao Jianhua, Lui Harvey, Kalia Sunil, Lee Tim K, Zeng Haishan
Photomedicine Institute, Department of Dermatology and Skin Science, University of British Columbia and Vancouver Coastal Health Research Institute, Vancouver, BC, Canada.
BC Cancer Research Institute, University of British Columbia, Vancouver, BC, Canada.
Front Oncol. 2024 Jun 19;14:1320220. doi: 10.3389/fonc.2024.1320220. eCollection 2024.
Our previous studies have demonstrated that Raman spectroscopy could be used for skin cancer detection with good sensitivity and specificity. The objective of this study is to determine if skin cancer detection can be further improved by combining deep neural networks and Raman spectroscopy.
Raman spectra of 731 skin lesions were included in this study, comprising 340 cancerous and precancerous lesions (melanoma, basal cell carcinoma, squamous cell carcinoma and actinic keratosis) and 391 benign lesions (melanocytic nevus and seborrheic keratosis). One-dimensional convolutional neural networks (1D-CNN) were developed for Raman spectral classification. The samples were randomly divided, with stratification, into training (70%), validation (10%) and test (20%) sets, and this split was repeated 56 times using parallel computing. Different data augmentation strategies were implemented for the training dataset, including adding random noise, spectral shifting, spectral combination and artificially synthesizing Raman spectra using one-dimensional generative adversarial networks (1D-GAN). The area under the receiver operating characteristic curve (ROC AUC) was used as the measure of diagnostic performance. Conventional machine learning approaches, including partial least squares for discriminant analysis (PLS-DA), principal component and linear discriminant analysis (PC-LDA), support vector machine (SVM), and logistic regression (LR), were evaluated for comparison using the same data splitting scheme as the 1D-CNN.
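The splitting and augmentation steps described above can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the noise scale, the maximum shift, the use of a random weighted average of two same-class spectra for "spectral combination", and the randomly generated stand-in spectra are all assumptions made for illustration. Only the class counts (340 and 391) and the 70/10/20 proportions come from the text.

```python
import numpy as np

def stratified_split(labels, fractions=(0.7, 0.1, 0.2), seed=0):
    """Return (train, val, test) index arrays that preserve class proportions."""
    rng = np.random.default_rng(seed)
    labels = np.asarray(labels)
    parts = ([], [], [])
    for cls in np.unique(labels):
        idx = rng.permutation(np.flatnonzero(labels == cls))
        n_train = int(round(fractions[0] * len(idx)))
        n_val = int(round(fractions[1] * len(idx)))
        parts[0].extend(idx[:n_train])
        parts[1].extend(idx[n_train:n_train + n_val])
        parts[2].extend(idx[n_train + n_val:])  # remainder goes to the test set
    return tuple(np.array(sorted(p)) for p in parts)

def add_noise(spectrum, rng, scale=0.01):
    """Add Gaussian noise scaled to the spectrum's standard deviation (assumed scheme)."""
    return spectrum + rng.normal(0.0, scale * spectrum.std(), spectrum.shape)

def shift_spectrum(spectrum, rng, max_shift=3):
    """Shift along the wavenumber axis by a random number of points, padding with edge values."""
    k = int(rng.integers(-max_shift, max_shift + 1))
    shifted = np.roll(spectrum, k)
    if k > 0:
        shifted[:k] = spectrum[0]
    elif k < 0:
        shifted[k:] = spectrum[-1]
    return shifted

def combine_spectra(a, b, rng):
    """Random weighted average of two spectra from the same class (assumed scheme)."""
    w = rng.uniform(0.0, 1.0)
    return w * a + (1.0 - w) * b

# Stand-in data: random numbers in place of real Raman spectra.
rng = np.random.default_rng(0)
labels = np.array([1] * 340 + [0] * 391)   # 1 = cancerous/precancerous, 0 = benign
spectra = rng.random((labels.size, 200))   # 200 points per spectrum is hypothetical

train_idx, val_idx, test_idx = stratified_split(labels, seed=0)

# Augment the training set only; validation and test spectra are left untouched.
augmented, augmented_labels = [], []
for i in train_idx:
    same_class = train_idx[labels[train_idx] == labels[i]]
    partner = spectra[rng.choice(same_class)]
    for new in (add_noise(spectra[i], rng),
                shift_spectrum(spectra[i], rng),
                combine_spectra(spectra[i], partner, rng)):
        augmented.append(new)
        augmented_labels.append(labels[i])
augmented = np.array(augmented)
augmented_labels = np.array(augmented_labels)

print(len(train_idx), len(val_idx), len(test_idx), augmented.shape)
```

Repeating the split with a different `seed` for each repetition would give the repeated random splits described above. The 1D-GAN synthesis step is omitted here because it requires a trained generative model.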
The ROC AUC values on the test dataset for models trained on the original training spectra were 0.886±0.022 (1D-CNN), 0.870±0.028 (PLS-DA), 0.875±0.033 (PC-LDA), 0.864±0.027 (SVM), and 0.525±0.045 (LR), which improved to 0.909±0.021 (1D-CNN), 0.899±0.022 (PLS-DA), 0.895±0.022 (PC-LDA), 0.901±0.020 (SVM), and 0.897±0.021 (LR), respectively, after augmentation of the training dataset (p<0.0001, Wilcoxon test). Paired analyses of 1D-CNN against the conventional machine learning approaches showed that 1D-CNN had a 1-3% improvement (p<0.001, Wilcoxon test).
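A paired comparison of this kind can be sketched as follows, assuming the Wilcoxon test refers to the signed-rank test applied to per-repetition AUC pairs. The AUC values below are synthetic numbers generated for illustration only, not the study's data, and the variable names are hypothetical.

```python
import numpy as np
from scipy.stats import wilcoxon

rng = np.random.default_rng(0)
n_repeats = 56  # number of repeated random splits, as stated in the text

# Synthetic per-split test AUCs for two models evaluated on the same splits.
auc_model_a = rng.normal(0.90, 0.02, n_repeats)
auc_model_b = auc_model_a - rng.normal(0.02, 0.01, n_repeats)

# Paired Wilcoxon signed-rank test on the per-split differences.
result = wilcoxon(auc_model_a, auc_model_b)
mean_gain = float(np.mean(auc_model_a - auc_model_b))

print(f"mean AUC difference = {mean_gain:.3f}, p = {result.pvalue:.2e}")
```

Pairing by split matters because both models see the same training and test spectra in each repetition, so split-to-split variation cancels out of the differences.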
Data augmentation not only improved the performance of both deep neural networks and conventional machine learning techniques by 2-4%, but also improved the performance of the models on spectra with higher noise or spectral shifting. Convolutional neural networks slightly outperformed conventional machine learning approaches for skin cancer detection by Raman spectroscopy.