Phillips Michael, Greenhalgh Jack, Marsden Helen, Palamaras Ioulios
Royal Perth Hospital, Perth, Australia; Harry Perkins Institute for Medical Research, Perth, Australia; and Centre for Medical Research, University of Western Australia, Perth, Australia.
Skin Analytics Ltd., London, UK.
Dermatol Pract Concept. 2019 Dec 31;10(1):e2020011. doi: 10.5826/dpc.1001a11. eCollection 2020.
Malignant melanoma is most likely to be cured when it is diagnosed at an early stage in its natural history. However, screening programs remain controversial, and many advocate screening only high-risk individuals.
This study aimed to evaluate the accuracy of an artificial intelligence neural network (Deep Ensemble for Recognition of Melanoma [DERM]) to identify malignant melanoma from dermoscopic images of pigmented skin lesions and to show how this compared to doctors' performance assessed by meta-analysis.
DERM was trained and tested using 7,102 dermoscopic images of both histologically confirmed melanoma (24%) and benign pigmented lesions (76%). A meta-analysis was conducted of studies examining the accuracy of naked-eye examination, with or without dermoscopy, by specialist and general physicians whose clinical diagnosis was compared to histopathology. The meta-analysis was based on evaluation of 32,226 pigmented lesions including 3,277 histopathology-confirmed malignant melanoma cases. The receiver operating characteristic (ROC) curve was used to examine and compare the diagnostic accuracy.
DERM achieved a ROC area under the curve (AUC) of 0.93 (95% confidence interval: 0.92-0.94), with sensitivity and specificity of 85.0% and 85.3%, respectively. Because avoiding false-negative results is essential, different decision thresholds were examined. At 95% sensitivity, DERM achieved a specificity of 64.1%; at 95% specificity, its sensitivity was 67%. The meta-analysis showed that primary care physicians (10 studies) achieved an AUC of 0.83 (95% confidence interval: 0.79-0.86), with sensitivity and specificity of 79.9% and 70.9%, respectively. Dermatologists (92 studies) achieved an AUC of 0.91 (95% confidence interval: 0.88-0.93), with sensitivity and specificity of 87.5% and 81.4%, respectively.
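The trade-off between sensitivity and specificity at different decision thresholds can be illustrated with a minimal sketch. This is not the DERM implementation or the study's analysis code. The scores and labels below are invented toy values, and the function names (`sens_spec`, `threshold_for_sensitivity`, `auc`) are hypothetical. The sketch assumes a classifier that outputs a melanoma score per lesion, where a score at or above the threshold counts as a positive call.

```python
from typing import List, Tuple

# Hypothetical toy data: classifier scores and histology labels (1 = melanoma).
# These values are invented for illustration and are not from the study.
scores = [0.9, 0.8, 0.7, 0.6, 0.55, 0.4, 0.3, 0.2, 0.1, 0.05]
labels = [1, 1, 0, 1, 0, 1, 0, 0, 0, 0]


def sens_spec(scores: List[float], labels: List[int],
              threshold: float) -> Tuple[float, float]:
    """Sensitivity and specificity when score >= threshold is called melanoma."""
    pairs = list(zip(scores, labels))
    tp = sum(1 for s, y in pairs if y == 1 and s >= threshold)
    fn = sum(1 for s, y in pairs if y == 1 and s < threshold)
    tn = sum(1 for s, y in pairs if y == 0 and s < threshold)
    fp = sum(1 for s, y in pairs if y == 0 and s >= threshold)
    return tp / (tp + fn), tn / (tn + fp)


def threshold_for_sensitivity(scores: List[float], labels: List[int],
                              target: float) -> Tuple[float, float, float]:
    """Highest observed score threshold whose sensitivity reaches the target."""
    for t in sorted(set(scores), reverse=True):
        sens, spec = sens_spec(scores, labels, t)
        if sens >= target:
            return t, sens, spec
    raise ValueError("no threshold reaches the target sensitivity")


def auc(scores: List[float], labels: List[int]) -> float:
    """ROC AUC as the fraction of (melanoma, benign) pairs ranked correctly.

    Ties count as half a correct ranking.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))
```

Lowering the threshold until a target sensitivity is reached, as `threshold_for_sensitivity` does, generally costs specificity. This is the same pattern as the drop in specificity reported above when sensitivity is fixed at 95%.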
DERM has the potential to be used as a decision support tool in primary care by providing a dermatologist-grade recommendation on the likelihood of malignant melanoma.