Dinnes Jacqueline, Deeks Jonathan J, Chuchu Naomi, Ferrante di Ruffano Lavinia, Matin Rubeta N, Thomson David R, Wong Kai Yuen, Aldridge Roger Benjamin, Abbott Rachel, Fawzy Monica, Bayliss Susan E, Grainge Matthew J, Takwoingi Yemisi, Davenport Clare, Godfrey Kathie, Walter Fiona M, Williams Hywel C
Institute of Applied Health Research, University of Birmingham, Birmingham, UK, B15 2TT.
Cochrane Database Syst Rev. 2018 Dec 4;12(12):CD011902. doi: 10.1002/14651858.CD011902.pub2.
BACKGROUND: Melanoma has one of the fastest rising incidence rates of any cancer. It accounts for a small percentage of skin cancer cases but is responsible for the majority of skin cancer deaths. Although history-taking and visual inspection of a suspicious lesion by a clinician are usually the first in a series of 'tests' to diagnose skin cancer, dermoscopy has become an important tool to assist diagnosis by specialist clinicians and is increasingly used in primary care settings. Dermoscopy is a magnification technique using visible light that allows more detailed examination of the skin than is possible with the naked eye alone. Establishing the additive value of dermoscopy over and above visual inspection alone across a range of observers and settings is critical to understanding its contribution to the diagnosis of melanoma and to future understanding of the potential role of the growing number of other high-resolution image analysis techniques.
OBJECTIVES: To determine the diagnostic accuracy of dermoscopy alone, or when added to visual inspection of a skin lesion, for the detection of cutaneous invasive melanoma and atypical intraepidermal melanocytic variants in adults. We separated studies according to whether the diagnosis was recorded face-to-face (in-person) or based on remote (image-based) assessment.
SEARCH METHODS: We undertook a comprehensive search of the following databases from inception up to August 2016: CENTRAL; MEDLINE; Embase; CINAHL; CPCI; Zetoc; Science Citation Index; US National Institutes of Health Ongoing Trials Register; NIHR Clinical Research Network Portfolio Database; and the World Health Organization International Clinical Trials Registry Platform. We studied reference lists and published systematic review articles.
SELECTION CRITERIA: Studies of any design that evaluated dermoscopy in adults with lesions suspicious for melanoma, compared with a reference standard of either histological confirmation or clinical follow-up. Data on the accuracy of visual inspection, to allow comparisons of tests, were included only if reported in the included studies of dermoscopy.
DATA COLLECTION AND ANALYSIS: Two review authors independently extracted all data using a standardised data extraction and quality assessment form (based on QUADAS-2). We contacted authors of included studies where information related to the target condition or diagnostic threshold was missing. We estimated accuracy using hierarchical summary receiver operating characteristic (SROC) methods. We analysed studies that allowed direct comparison between tests. To facilitate interpretation of results, we computed values of sensitivity at the point on the SROC curve with 80% fixed specificity and values of specificity with 80% fixed sensitivity. We investigated the impact of in-person test interpretation; use of a purposely developed algorithm to assist diagnosis; observer expertise; and dermoscopy training.
MAIN RESULTS: We included a total of 104 study publications reporting on 103 study cohorts with 42,788 lesions (including 5700 cases), providing 354 datasets for dermoscopy. The risk of bias was mainly low for the index test and reference standard domains and mainly high or unclear for participant selection and participant flow. Concerns regarding the applicability of study findings were largely scored as 'high' in three of four domains assessed. Selective participant recruitment, lack of reproducibility of diagnostic thresholds and lack of detail on observer expertise were particularly problematic.
The accuracy of dermoscopy for the detection of invasive melanoma or atypical intraepidermal melanocytic variants was reported in 86 datasets: 26 for evaluations conducted in person (dermoscopy added to visual inspection) and 60 for image-based evaluations (diagnosis based on interpretation of dermoscopic images). Analyses of studies by prior testing revealed no obvious effect on accuracy; analyses were hampered by the lack of studies in primary care, lack of relevant information and the restricted inclusion of lesions selected for biopsy or excision. Accuracy was higher for in-person diagnosis than for image-based evaluations (relative diagnostic odds ratio (RDOR) 4.6, 95% confidence interval (CI) 2.4 to 9.0; P < 0.001).
We compared accuracy for (a) in-person evaluations of dermoscopy (26 evaluations; 23,169 lesions and 1664 melanomas) versus visual inspection alone (13 evaluations; 6740 lesions and 459 melanomas), and for (b) image-based evaluations of dermoscopy (60 evaluations; 13,475 lesions and 2851 melanomas) versus image-based visual inspection (11 evaluations; 1740 lesions and 305 melanomas). For both comparisons, meta-analysis found dermoscopy to be more accurate than visual inspection alone, with RDORs of (a) 4.7 (95% CI 3.0 to 7.5; P < 0.001) and (b) 5.6 (95% CI 3.7 to 8.5; P < 0.001).
For (a), the predicted difference in sensitivity at a fixed specificity of 80% was 16% (95% CI 8% to 23%; 92% for dermoscopy + visual inspection versus 76% for visual inspection), and the predicted difference in specificity at a fixed sensitivity of 80% was 20% (95% CI 7% to 33%; 95% for dermoscopy + visual inspection versus 75% for visual inspection). For (b), the predicted difference in sensitivity at a fixed specificity of 80% was 34% (95% CI 24% to 46%; 81% for dermoscopy versus 47% for visual inspection), and the predicted difference in specificity at a fixed sensitivity of 80% was 40% (95% CI 27% to 57%; 82% for dermoscopy versus 42% for visual inspection).
Using the median prevalence of disease in each set of studies ((a) 12% for in-person and (b) 24% for image-based), for a hypothetical population of 1000 lesions, an increase in sensitivity of (a) 16% (in-person) and (b) 34% (image-based) from using dermoscopy at a fixed specificity of 80% equates to a reduction in the number of melanomas missed of (a) 19 and (b) 81, with (a) 176 and (b) 152 false positive results. An increase in specificity of (a) 20% (in-person) and (b) 40% (image-based) at a fixed sensitivity of 80% equates to a reduction in the number of unnecessary excisions from using dermoscopy of (a) 176 and (b) 304, with (a) 24 and (b) 48 melanomas missed.
The use of a named or published algorithm to assist dermoscopy interpretation (as opposed to no reported algorithm or reported use of pattern analysis) had no significant impact on accuracy for either in-person (RDOR 1.4, 95% CI 0.34 to 5.6; P = 0.17) or image-based (RDOR 1.4, 95% CI 0.60 to 3.3; P = 0.22) evaluations. This result was supported by subgroup analysis according to algorithm used. We observed higher accuracy for observers reported as having high experience and for those classed as 'expert consultants' than for those considered to have less experience in dermoscopy, particularly for image-based evaluations. Evidence for the effect of dermoscopy training on test accuracy was very limited but suggested associated improvements in sensitivity.
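The per-1000-lesion figures in the results can be approximately reproduced with simple arithmetic. The sketch below is illustrative only and is not the review authors' code: the function name `per_1000` and its parameters are hypothetical, and the inputs are the rounded percentages quoted in the abstract. Because of that rounding, the image-based reduction in missed melanomas comes out at roughly 82 rather than the reported 81, which was presumably derived from unrounded estimates.

```python
def per_1000(prevalence, sens_gain, spec_gain, fixed=0.80, n=1000):
    """Translate accuracy differences into counts for n hypothetical lesions.

    prevalence: proportion of lesions that are melanoma
    sens_gain:  absolute gain in sensitivity at the fixed specificity
    spec_gain:  absolute gain in specificity at the fixed sensitivity
    fixed:      the fixed specificity / sensitivity (80% in the abstract)
    """
    melanomas = n * prevalence       # lesions that are melanoma
    benign = n - melanomas           # lesions that are not melanoma
    return {
        # Scenario 1: specificity held at `fixed`
        "fewer_melanomas_missed": melanomas * sens_gain,
        "false_positives": benign * (1 - fixed),
        # Scenario 2: sensitivity held at `fixed`
        "fewer_unnecessary_excisions": benign * spec_gain,
        "melanomas_missed": melanomas * (1 - fixed),
    }


# (a) in-person: median prevalence 12%, gains of 16% and 20%
in_person = per_1000(prevalence=0.12, sens_gain=0.16, spec_gain=0.20)
# (b) image-based: median prevalence 24%, gains of 34% and 40%
image_based = per_1000(prevalence=0.24, sens_gain=0.34, spec_gain=0.40)

for label, result in (("in-person", in_person), ("image-based", image_based)):
    print(label, {key: round(value, 1) for key, value in result.items()})
```

The false positive and missed melanoma counts depend only on prevalence and the fixed 80% threshold, which is why they are the same with or without dermoscopy in each scenario; only the "fewer" counts reflect the gain from dermoscopy.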
AUTHORS' CONCLUSIONS: Despite the observed limitations in the evidence base, dermoscopy is a valuable tool to support the visual inspection of a suspicious skin lesion for the detection of melanoma and atypical intraepidermal melanocytic variants, particularly in referred populations and in the hands of experienced users. Data to support its use in primary care are limited; however, it may assist in triaging suspicious lesions for urgent referral when employed by suitably trained clinicians. Formal algorithms may be of most use for dermoscopy training purposes and for less expert observers; however, reliable data comparing approaches using dermoscopy in person are lacking.