Carrera Cristina, Marchetti Michael A, Dusza Stephen W, Argenziano Giuseppe, Braun Ralph P, Halpern Allan C, Jaimes Natalia, Kittler Harald J, Malvehy Josep, Menzies Scott W, Pellacani Giovanni, Puig Susana, Rabinovitz Harold S, Scope Alon, Soyer H Peter, Stolz Wilhelm, Hofmann-Wellenhof Rainer, Zalaudek Iris, Marghoob Ashfaq A
Dermatology Service, Department of Medicine, Memorial Sloan Kettering Cancer Center, New York, New York; Melanoma Unit, Department of Dermatology, Hospital Clinic Barcelona, Institut d'Investigacions Biomèdiques August Pi i Sunyer, University of Barcelona.
Dermatology Service, Department of Medicine, Memorial Sloan Kettering Cancer Center, New York, New York.
JAMA Dermatol. 2016 Jul 1;152(7):798-806. doi: 10.1001/jamadermatol.2016.0624.
IMPORTANCE: The comparative diagnostic performance of dermoscopic algorithms and their individual criteria is not well studied.
OBJECTIVE: To analyze the discriminatory power and reliability of dermoscopic criteria used in melanoma detection and to compare the diagnostic accuracy of existing algorithms.
DESIGN, SETTING, AND PARTICIPANTS: This was a retrospective, observational study of 477 lesions (119 melanomas [24.9%] and 358 nevi [75.1%]), which were divided into 12 image sets that consisted of 39 or 40 images per set. A link on the International Dermoscopy Society website from January 1, 2011, through December 31, 2011, directed participants to the study website. Data analysis was performed from June 1, 2013, through May 31, 2015. Participants included physicians, residents, and medical students, and there were no specialty-type or experience-level restrictions. Participants were randomly assigned to evaluate 1 of the 12 image sets.
MAIN OUTCOMES AND MEASURES: Associations with melanoma and intraclass correlation coefficients (ICCs) were evaluated for the presence of dermoscopic criteria. Diagnostic accuracy measures were estimated for the following algorithms: the ABCD rule, the Menzies method, the 7-point checklist, the 3-point checklist, chaos and clues, and CASH (color, architecture, symmetry, and homogeneity).
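As a minimal sketch of the diagnostic accuracy measures named above, the following Python snippet computes sensitivity and specificity from a 2x2 table of algorithm calls against the reference diagnosis. The class name `ConfusionCounts`, the function names, and the example counts are all hypothetical illustrations; they are not the study's data or the authors' analysis code.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConfusionCounts:
    """Counts of algorithm calls versus the reference diagnosis (hypothetical)."""
    tp: int  # melanomas called melanoma
    fn: int  # melanomas called nevus
    fp: int  # nevi called melanoma
    tn: int  # nevi called nevus


def sensitivity(c: ConfusionCounts) -> float:
    """Proportion of melanomas that the algorithm flags as melanoma."""
    return c.tp / (c.tp + c.fn)


def specificity(c: ConfusionCounts) -> float:
    """Proportion of nevi that the algorithm correctly calls nevus."""
    return c.tn / (c.tn + c.fp)


# Hypothetical counts chosen only to illustrate the formulas.
example = ConfusionCounts(tp=90, fn=10, fp=60, tn=40)

print(f"sensitivity = {sensitivity(example):.1%}")
print(f"specificity = {specificity(example):.1%}")
```

A method tuned toward high sensitivity flags more lesions overall, which typically raises the false-positive count and lowers specificity.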
RESULTS: A total of 240 participants registered, and 103 (42.9%) evaluated all images. The 110 participants (45.8%) who evaluated fewer than 20 lesions were excluded, resulting in data from 130 participants (54.2%), 121 (93.1%) of whom were regular dermoscopy users. Criteria associated with melanoma included marked architectural disorder (odds ratio [OR], 6.6; 95% CI, 5.6-7.8), pattern asymmetry (OR, 4.9; 95% CI, 4.1-5.8), nonorganized pattern (OR, 3.3; 95% CI, 2.9-3.7), border score of 6 (OR, 3.3; 95% CI, 2.5-4.3), and contour asymmetry (OR, 3.2; 95% CI, 2.7-3.7) (P < .001 for all). Most dermoscopic criteria had poor to fair interobserver agreement. Criteria that reached moderate levels of agreement included comma vessels (ICC, 0.44; 95% CI, 0.40-0.49), absence of vessels (ICC, 0.46; 95% CI, 0.42-0.51), dark brown color (ICC, 0.40; 95% CI, 0.35-0.44), and architectural disorder (ICC, 0.43; 95% CI, 0.39-0.48). The Menzies method had the highest sensitivity for melanoma diagnosis (95.1%) but the lowest specificity (24.8%) of all the methods (P < .001). The ABCD rule had the highest specificity (59.4%). All methods had similar areas under the receiver operating characteristic curves.
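The counts and percentages reported in this abstract can be cross-checked with simple arithmetic. The short Python sketch below recomputes them from the counts stated in the text; the helper name `pct` and the variable names are illustrative assumptions, and rounding to one decimal place is assumed to match the reported figures.

```python
def pct(part: int, whole: int) -> float:
    """Percentage of `part` in `whole`, rounded to one decimal place."""
    return round(100 * part / whole, 1)


# Counts taken from the abstract text.
registered = 240
evaluated_all = 103
excluded = 110
included = registered - excluded  # 130 participants analyzed
regular_users = 121

lesions = 477
melanomas = 119
nevi = 358

print(pct(evaluated_all, registered))  # share who evaluated all images
print(pct(excluded, registered))       # share excluded
print(pct(included, registered))       # share included in the analysis
print(pct(regular_users, included))    # regular dermoscopy users among included
print(pct(melanomas, lesions))         # melanoma share of lesions
print(pct(nevi, lesions))              # nevus share of lesions
```

Note that the 93.1% figure uses the 130 included participants as its denominator, not the 240 who registered.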
CONCLUSIONS AND RELEVANCE: Important dermoscopic criteria for melanoma recognition were revalidated by participants with varied experience. The six algorithms tested had similar but modest levels of diagnostic accuracy, and the interobserver agreement of most individual criteria was poor.