Guan Hewen, Yuan Qihang, Lv Kejia, Qi Yushuo, Jiang Yuankuan, Zhang Shumeng, Miao Dong, Wang Zhiyi, Lin Jingrong
Department of Dermatology, First Affiliated Hospital of Dalian Medical University, Dalian, Liaoning, China.
Department of General Surgery, First Affiliated Hospital of Dalian Medical University, Dalian, Liaoning, China.
J Cancer. 2024 Apr 23;15(11):3350-3361. doi: 10.7150/jca.94759. eCollection 2024.
This study used machine learning algorithms to develop a predictive model that differentiates basal cell carcinoma (BCC) from actinic keratosis (AK) in dermoscopic images. We compiled 904 dermoscopic images from two sources, the public HAM10000 dataset and our proprietary dataset from the First Affiliated Hospital of Dalian Medical University (DAYISET 1), and divided them into four cohorts. We developed a deep learning model for quantitative analysis of image features and integrated 15 machine learning algorithms, generating 207 algorithmic combinations through random combination and cross-validation. The final predictive model, which integrates XGBoost with Lasso regression, performed effectively in the differential diagnosis of BCC and AK. It showed high sensitivity in the training set and maintained stable performance across three validation sets: the area under the curve (AUC) reached 1.000 in the training set and averaged 0.695 in the validation sets. We conclude that the discriminative diagnostic model built on machine learning algorithms has excellent predictive capabilities and could improve the efficiency of clinical decision-making, reduce unnecessary biopsies, and provide valuable guidance for further treatment.
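The AUC figures quoted in the abstract (1.000 in training, an average of 0.695 across the validation sets) can be read as the probability that a randomly chosen BCC image receives a higher model score than a randomly chosen AK image. The sketch below illustrates that definition and the averaging over validation cohorts. It is not the authors' code: the function names `roc_auc` and `mean_auc`, the label coding (1 = BCC, 0 = AK), and the toy scores are all assumptions made for illustration.

```python
from typing import List, Sequence, Tuple


def roc_auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """AUC via pairwise comparison: the fraction of (positive, negative)
    pairs in which the positive scores higher; ties count as half.

    Assumed label coding (hypothetical): 1 = BCC, 0 = AK.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative sample")

    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def mean_auc(cohorts: List[Tuple[Sequence[int], Sequence[float]]]) -> float:
    """Unweighted mean AUC over several (labels, scores) cohorts."""
    aucs = [roc_auc(labels, scores) for labels, scores in cohorts]
    return sum(aucs) / len(aucs)


# Toy data only; these are not values from the study.
separable = ([1, 1, 0, 0], [0.9, 0.8, 0.2, 0.1])  # every BCC outranks every AK
mixed = ([1, 0, 1, 0], [0.9, 0.8, 0.2, 0.1])      # one BCC scored below one AK

print(roc_auc(*separable))          # 1.0
print(roc_auc(*mixed))              # 0.75
print(mean_auc([separable, mixed])) # 0.875
```

An AUC of 1.0 means the scores separate the two classes perfectly on that set, while 0.5 corresponds to chance-level ranking. The pairwise loop is quadratic in the number of images, which is fine for a few hundred images per cohort; a rank-based formulation would be preferable for larger sets.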