Ferrante di Ruffano Lavinia, Takwoingi Yemisi, Dinnes Jacqueline, Chuchu Naomi, Bayliss Susan E, Davenport Clare, Matin Rubeta N, Godfrey Kathie, O'Sullivan Colette, Gulati Abha, Chan Sue Ann, Durack Alana, O'Connell Susan, Gardiner Matthew D, Bamber Jeffrey, Deeks Jonathan J, Williams Hywel C
Institute of Applied Health Research, University of Birmingham, Edgbaston Campus, Birmingham, UK, B15 2TT.
Cochrane Database Syst Rev. 2018 Dec 4;12(12):CD013186. doi: 10.1002/14651858.CD013186.
BACKGROUND: Early accurate detection of all skin cancer types is essential to guide appropriate management, reduce morbidity, and improve survival. Melanoma and cutaneous squamous cell carcinoma (cSCC) are high-risk skin cancers that have the potential to metastasise and ultimately lead to death, whereas basal cell carcinoma (BCC) is usually localised, with potential to infiltrate and damage surrounding tissue. Anxiety around missing early curable cases needs to be balanced against inappropriate referral and unnecessary excision of benign lesions. Computer-assisted diagnosis (CAD) systems use artificial intelligence to analyse lesion data and arrive at a diagnosis of skin cancer. When used in unreferred settings ('primary care'), CAD may help general practitioners (GPs) or other clinicians to triage high-risk lesions to secondary care more appropriately. Used alongside clinical and dermoscopic suspicion of malignancy, CAD may reduce unnecessary excisions without missing melanoma cases.
OBJECTIVES: To determine the accuracy of CAD systems for diagnosing cutaneous invasive melanoma and atypical intraepidermal melanocytic variants, BCC or cSCC in adults, and to compare their accuracy with that of dermoscopy.
SEARCH METHODS: We undertook a comprehensive search of the following databases from inception up to August 2016: Cochrane Central Register of Controlled Trials (CENTRAL); MEDLINE; Embase; CINAHL; CPCI; Zetoc; Science Citation Index; US National Institutes of Health Ongoing Trials Register; NIHR Clinical Research Network Portfolio Database; and the World Health Organization International Clinical Trials Registry Platform. We also checked reference lists and published systematic review articles.
SELECTION CRITERIA: We included studies of any design that evaluated CAD alone, or in comparison with dermoscopy, in adults with lesions suspicious for melanoma, BCC, or cSCC, and that compared results with a reference standard of either histological confirmation or clinical follow-up.
DATA COLLECTION AND ANALYSIS: Two review authors independently extracted all data using a standardised data extraction and quality assessment form (based on QUADAS-2). We contacted authors of included studies where information related to the target condition or diagnostic threshold was missing. We estimated summary sensitivities and specificities separately by type of CAD system, using the bivariate hierarchical model. We compared CAD with dermoscopy using (a) all available CAD data (indirect comparisons), and (b) studies providing paired data for both tests (direct comparisons). We tested the contribution of human decision-making to the accuracy of CAD diagnoses in a sensitivity analysis by removing studies that gave CAD results to clinicians to guide diagnostic decision-making.
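For readers unfamiliar with the bivariate hierarchical model named above, its usual textbook form is sketched below. This is an illustration only: the abstract does not say which parameterisation or software the review used, and every symbol here (study index $i$, counts $\mathrm{TP}_i$ and $\mathrm{TN}_i$, group sizes $n_{1i}$ and $n_{0i}$, means $\mu$ and covariance $\Sigma$) is introduced for this sketch rather than taken from the source.

```latex
% Within-study level: binomial counts in the diseased (n_{1i}) and
% non-diseased (n_{0i}) groups of study i (assumed notation)
\mathrm{TP}_i \sim \mathrm{Binomial}(n_{1i},\, \mathrm{Se}_i), \qquad
\mathrm{TN}_i \sim \mathrm{Binomial}(n_{0i},\, \mathrm{Sp}_i)

% Between-study level: logit sensitivity and logit specificity vary
% jointly across studies, which allows them to be correlated
\begin{pmatrix} \operatorname{logit}\mathrm{Se}_i \\ \operatorname{logit}\mathrm{Sp}_i \end{pmatrix}
\sim \mathcal{N}\!\left(
\begin{pmatrix} \mu_{\mathrm{Se}} \\ \mu_{\mathrm{Sp}} \end{pmatrix},\;
\Sigma = \begin{pmatrix}
\sigma_{\mathrm{Se}}^{2} & \sigma_{\mathrm{Se,Sp}} \\
\sigma_{\mathrm{Se,Sp}} & \sigma_{\mathrm{Sp}}^{2}
\end{pmatrix}\right)

% Summary estimates are the back-transformed means
\widehat{\mathrm{Se}} = \operatorname{logit}^{-1}(\mu_{\mathrm{Se}}), \qquad
\widehat{\mathrm{Sp}} = \operatorname{logit}^{-1}(\mu_{\mathrm{Sp}})
```

Modelling the two logits jointly, rather than pooling sensitivity and specificity separately, accounts for the trade-off between them that arises when studies use different diagnostic thresholds.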
MAIN RESULTS: We included 42 studies, 24 evaluating digital dermoscopy-based CAD systems (Derm-CAD) in 23 study cohorts with 9602 lesions (1220 melanomas, at least 83 BCCs, 9 cSCCs), providing 32 datasets for Derm-CAD and seven for dermoscopy. Eighteen studies evaluated spectroscopy-based CAD (Spectro-CAD) in 16 study cohorts with 6336 lesions (934 melanomas, 163 BCCs, 49 cSCCs), providing 32 datasets for Spectro-CAD and six for dermoscopy. These consisted of 15 studies using multispectral imaging (MSI), two studies using electrical impedance spectroscopy (EIS) and one study using diffuse-reflectance spectroscopy. Studies were incompletely reported and at unclear to high risk of bias across all domains. Included studies inadequately address the review question, due to an abundance of low-quality studies, poor reporting, and recruitment of highly selected groups of participants. Across all CAD systems, and even between studies evaluating them, we found considerable variation in the hardware and software technologies used, the types of classification algorithm employed, the methods used to train the algorithms, and which lesion morphological features were extracted and analysed. Meta-analysis found CAD systems had high sensitivity for correct identification of cutaneous invasive melanoma and atypical intraepidermal melanocytic variants in highly selected populations, but with low and very variable specificity, particularly for Spectro-CAD systems. Pooled data from 22 studies estimated the sensitivity of Derm-CAD for the detection of melanoma as 90.1% (95% confidence interval (CI) 84.0% to 94.0%) and specificity as 74.3% (95% CI 63.6% to 82.7%). Pooled data from eight studies estimated the sensitivity of multispectral imaging CAD (MSI-CAD) as 92.9% (95% CI 83.7% to 97.1%) and specificity as 43.6% (95% CI 24.8% to 64.5%).
When applied to a hypothetical population of 1000 lesions at the mean observed melanoma prevalence of 20%, Derm-CAD would miss 20 melanomas and would lead to 206 false-positive results for melanoma. MSI-CAD would miss 14 melanomas and would lead to 451 false-positive results for melanoma. Preliminary findings suggest CAD systems are at least as sensitive as assessment of dermoscopic images for the diagnosis of invasive melanoma and atypical intraepidermal melanocytic variants. Because of the paucity of studies, we are unable to make summary statements about the use of CAD in unreferred populations, its accuracy in detecting keratinocyte cancers, or its use in any setting as a diagnostic aid.
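The hypothetical-population figures above follow directly from the pooled estimates. The sketch below reproduces the arithmetic; the function name `expected_counts` and its parameters are illustrative choices, not anything from the review, and the calculation assumes the pooled point estimates apply unchanged at 20% prevalence.

```python
def expected_counts(prevalence, sensitivity, specificity, n_lesions=1000):
    """Return (missed melanomas, false positives) for a hypothetical cohort.

    Assumes the pooled point estimates hold exactly in this population;
    the confidence intervals reported in the review are ignored here.
    """
    melanomas = n_lesions * prevalence           # lesions that are melanoma
    non_melanomas = n_lesions - melanomas        # all other lesions
    missed = melanomas * (1 - sensitivity)       # false negatives
    false_positives = non_melanomas * (1 - specificity)
    return round(missed), round(false_positives)


# Pooled estimates quoted in the abstract
derm_cad = expected_counts(0.20, 0.901, 0.743)
msi_cad = expected_counts(0.20, 0.929, 0.436)

print("Derm-CAD (missed, false positives):", derm_cad)  # (20, 206)
print("MSI-CAD  (missed, false positives):", msi_cad)   # (14, 451)
```

With 200 melanomas and 800 non-melanomas per 1000 lesions, the gain in sensitivity from MSI-CAD avoids about six missed melanomas, at the cost of roughly 245 additional false-positive results.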
AUTHORS' CONCLUSIONS: In highly selected patient populations, all CAD types demonstrate high sensitivity and could prove useful as a back-up for specialist diagnosis, helping to minimise the risk of missing melanomas. However, the evidence base is currently too poor to understand whether CAD system outputs translate to different clinical decision-making in practice. Insufficient data are available on the use of CAD in community settings, or for the detection of keratinocyte cancers. The evidence base for individual systems is too limited to draw conclusions on which might be preferred for practice. Prospective comparative studies are required that assess already evaluated CAD systems as diagnostic aids, in comparison with face-to-face dermoscopy, in participant populations that are representative of those in which the test would be used in practice.