Department of Dermatology and Venereology, Institute of Clinical Sciences, Sahlgrenska Academy, University of Gothenburg, SE-413 45 Gothenburg, Sweden.
Acta Derm Venereol. 2022 Jul 13;102:adv00750. doi: 10.2340/actadv.v102.2028.
Research relating to machine learning algorithms, including convolutional neural networks, has increased during the past 5 years. The aim of this pilot study was to investigate how accurately a convolutional neural network, trained on Swedish registry data, could predict cutaneous invasive and in situ melanoma (CMM) within 5 years. A cohort of 1,208,393 individuals was used. Registry data ranged from 4 July 2005 to 31 December 2011 and were used to predict CMM occurring between 1 January 2012 and 31 December 2016. A convolutional neural network with one-dimensional convolutions with respect to time was trained using healthcare databases and registers. The algorithm was trained on 23,886 individuals. Validation was performed on a holdout validation set of 6,000 individuals. After training and validation, the convolutional neural network was evaluated on a test set (1,000 individuals with a CMM occurring within 5 years and 5,000 without). The area under the receiver-operating characteristic curve was 0.59 (95% confidence interval (95% CI) 0.57-0.61). At the point on the receiver-operating characteristic curve where sensitivity equalled specificity, both had a value of 56% (sensitivity 95% CI 53-60% and specificity 95% CI 55-58%). Although at an early stage, this pilot investigation demonstrates potential usefulness for machine learning algorithms in predicting melanoma risk.
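The two summary measures reported above, the area under the receiver-operating characteristic curve and the operating point where sensitivity equals specificity, can be illustrated with a minimal sketch. It is not the study's code: the function names `auc_mann_whitney` and `equal_sens_spec_point` are hypothetical, the scores and labels are made-up toy values, and the sketch assumes that a higher score means a higher predicted risk of CMM.

```python
def auc_mann_whitney(scores, labels):
    """Area under the ROC curve, computed as the probability that a randomly
    chosen case (label 1) scores higher than a randomly chosen non-case
    (label 0). Ties count as half a win."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def equal_sens_spec_point(scores, labels):
    """Return (threshold, sensitivity, specificity) for the threshold at which
    sensitivity and specificity are closest to each other.
    An individual is predicted positive when score >= threshold."""
    n_pos = sum(1 for y in labels if y == 1)
    n_neg = len(labels) - n_pos
    best = None
    for t in sorted(set(scores)):
        tp = sum(1 for s, y in zip(scores, labels) if y == 1 and s >= t)
        tn = sum(1 for s, y in zip(scores, labels) if y == 0 and s < t)
        sens = tp / n_pos
        spec = tn / n_neg
        gap = abs(sens - spec)
        if best is None or gap < best[0]:
            best = (gap, t, sens, spec)
    return best[1], best[2], best[3]


# Toy data (assumption, not from the study): 4 cases and 4 non-cases.
toy_scores = [0.1, 0.2, 0.3, 0.4, 0.6, 0.7, 0.8, 0.9]
toy_labels = [0, 0, 1, 0, 1, 0, 1, 1]

print(auc_mann_whitney(toy_scores, toy_labels))       # 0.8125
print(equal_sens_spec_point(toy_scores, toy_labels))  # (0.6, 0.75, 0.75)
```

An area of 0.5 corresponds to chance-level ranking and 1.0 to perfect ranking, which puts the reported value of 0.59 in context. The pairwise loop is quadratic in the number of individuals; for larger sets a rank-based or library implementation would be the usual choice.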