Dick Vincent, Sinz Christoph, Mittlböck Martina, Kittler Harald, Tschandl Philipp
ViDIR Group, Department of Dermatology, Medical University of Vienna, Vienna, Austria.
Center for Medical Statistics, Informatics and Intelligent Systems, Medical University of Vienna, Vienna, Austria.
JAMA Dermatol. 2019 Nov 1;155(11):1291-1299. doi: 10.1001/jamadermatol.2019.1375.
The recent advances in the field of machine learning have raised expectations that computer-aided diagnosis will become the standard for the diagnosis of melanoma.
To critically review the current literature and compare the diagnostic accuracy of computer-aided diagnosis with that of human experts.
The MEDLINE, arXiv, and PubMed Central databases were searched to identify eligible studies published between January 1, 2002, and December 31, 2018.
Studies that reported on the accuracy of automated systems for melanoma were selected. Search terms included melanoma, diagnosis, detection, computer aided, and artificial intelligence.
Evaluation of the risk of bias was performed using the QUADAS-2 tool, and quality assessment was based on predefined criteria. Data were analyzed from February 1 to March 10, 2019.
Summary estimates of sensitivity and specificity and summary receiver operating characteristic curves were the primary outcomes.
The literature search yielded 1694 potentially eligible studies, of which 132 were included and 70 offered sufficient information for a quantitative analysis. Most studies came from the field of computer science. Prospective clinical studies were rare. Combining the results for automated systems gave a melanoma sensitivity of 0.74 (95% CI, 0.66-0.80) and a specificity of 0.84 (95% CI, 0.79-0.88). Sensitivity was lower in studies that used independent test sets than in those that did not (0.51; 95% CI, 0.34-0.69 vs 0.82; 95% CI, 0.77-0.86; P < .001); however, the specificity was similar (0.83; 95% CI, 0.71-0.91 vs 0.85; 95% CI, 0.80-0.88; P = .67). Compared with dermatologists' diagnosis, computer-aided diagnosis showed similar sensitivity and a specificity that was 10 percentage points lower, but the difference was not statistically significant. Studies were heterogeneous, and substantial risk of bias was found in all but 4 of the 70 studies included in the quantitative analysis.
Although the accuracy of computer-aided diagnosis for melanoma detection is comparable to that of experts, the real-world applicability of these systems is unknown and potentially limited owing to overfitting and the risk of bias of the studies at hand.
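For readers less familiar with the outcome measures, the following is a minimal sketch of how per-study sensitivity and specificity are derived from a 2x2 table and how a summary estimate with a 95% CI can be formed. It is an illustration only: the abstract does not state which pooling model was used, and the sketch substitutes a simplified univariate, fixed-effect, inverse-variance pooling on the logit scale, which ignores between-study heterogeneity and the correlation between sensitivity and specificity. The function names and the study counts are hypothetical and are not taken from the review.

```python
import math


def sens_spec(tp, fn, tn, fp):
    """Sensitivity and specificity from the cells of a 2x2 diagnostic table."""
    sensitivity = tp / (tp + fn)  # melanomas correctly flagged
    specificity = tn / (tn + fp)  # benign lesions correctly cleared
    return sensitivity, specificity


def inv_logit(x):
    """Map a log-odds value back to a proportion."""
    return 1.0 / (1.0 + math.exp(-x))


def pool_proportion(events, totals, z=1.96):
    """Pool proportions across studies on the logit scale.

    Assumption: fixed-effect inverse-variance weighting with a 0.5
    continuity correction. This is a simplification, not the model
    used in the review. Returns (estimate, lower, upper).
    """
    weighted_sum = 0.0
    weight_total = 0.0
    for e, n in zip(events, totals):
        e_c = e + 0.5          # corrected number of events
        ne_c = (n - e) + 0.5   # corrected number of non-events
        y = math.log(e_c / ne_c)           # logit of the study proportion
        var = 1.0 / e_c + 1.0 / ne_c       # approximate variance of the logit
        weighted_sum += y / var
        weight_total += 1.0 / var
    mean = weighted_sum / weight_total
    se = math.sqrt(1.0 / weight_total)
    return inv_logit(mean), inv_logit(mean - z * se), inv_logit(mean + z * se)


# Hypothetical studies: (true positives, false negatives, true negatives, false positives)
studies = [(40, 10, 180, 20), (25, 15, 90, 30), (60, 20, 300, 40)]

# Sensitivity pools TP out of all melanomas; specificity pools TN out of all benign lesions.
pooled_sens = pool_proportion([tp for tp, fn, tn, fp in studies],
                              [tp + fn for tp, fn, tn, fp in studies])
pooled_spec = pool_proportion([tn for tp, fn, tn, fp in studies],
                              [tn + fp for tp, fn, tn, fp in studies])
```

Each call returns the pooled estimate followed by the lower and upper confidence limits. Working on the logit scale keeps the limits inside the 0 to 1 range, which is one reason diagnostic meta-analyses commonly transform proportions before pooling.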