Miglioretti Diana L, Ichikawa Laura, Smith Robert A, Bassett Lawrence W, Feig Stephen A, Monsees Barbara, Parikh Jay R, Rosenberg Robert D, Sickles Edward A, Carney Patricia A
Division of Biostatistics, Department of Public Health Sciences, University of California Davis School of Medicine, One Shields Ave, Med Sci 1C, Rm 144, Davis, CA 95616.
AJR Am J Roentgenol. 2015 Apr;204(4):W486-91. doi: 10.2214/AJR.13.12313.
Using a combination of performance measures, we updated previously proposed criteria for identifying physicians whose performance interpreting screening mammography may indicate suboptimal interpretation skills.
In this study, six expert breast imagers used a method based on the Angoff approach to update criteria for acceptable mammography performance on the basis of two sets of combined performance measures: set 1, sensitivity and specificity for facilities with complete capture of false-negative cancers; and set 2, cancer detection rate (CDR), recall rate, and positive predictive value of a recall (PPV1) for facilities that cannot capture false-negative cancers but have reliable cancer follow-up information for positive mammography results. Decisions were informed by normative data from the Breast Cancer Surveillance Consortium (BCSC).
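For readers unfamiliar with the two sets of measures, the sketch below shows how each is conventionally computed from screening counts. It is illustrative only and is not taken from the study: the function names, argument names, and example counts are hypothetical, all values are returned as percentages except CDR (per 1000 screens), and denominators are assumed to be nonzero.

```python
def set1_measures(tp, fp, tn, fn):
    """Set 1: needs complete capture of false-negative cancers (fn).

    tp/fp/tn/fn are counts of true-positive, false-positive,
    true-negative, and false-negative screening examinations.
    Assumes tp + fn > 0 and tn + fp > 0.
    """
    return {
        "sensitivity": 100.0 * tp / (tp + fn),  # percent of cancers recalled
        "specificity": 100.0 * tn / (tn + fp),  # percent of non-cancers not recalled
    }


def set2_measures(cancers_detected, recalls, screens):
    """Set 2: needs only follow-up of positive (recalled) examinations.

    Assumes screens > 0 and recalls > 0.
    """
    return {
        "cdr_per_1000": 1000.0 * cancers_detected / screens,  # cancer detection rate
        "recall_rate": 100.0 * recalls / screens,             # percent of screens recalled
        "ppv1": 100.0 * cancers_detected / recalls,           # percent of recalls with cancer
    }


if __name__ == "__main__":
    # Hypothetical counts for one reader: 2000 screens, 198 recalls, 8 cancers found.
    print(set1_measures(tp=8, fp=190, tn=1800, fn=2))
    print(set2_measures(cancers_detected=8, recalls=198, screens=2000))
```

Splitting the two sets into separate functions mirrors the distinction drawn above: the second function never asks for false-negative counts, so it can be applied at facilities that cannot capture them.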
Updated combined ranges for acceptable sensitivity and specificity of screening mammography are sensitivity≥80% and specificity≥85% or sensitivity 75-79% and specificity 88-97%. Updated ranges for CDR, recall rate, and PPV1 are: CDR≥6 per 1000, recall rate 3-20%, and any PPV1; CDR 4-6 per 1000, recall rate 3-15%, and PPV1≥3%; or CDR 2.5-4.0 per 1000, recall rate 5-12%, and PPV1 3-8%. Using the original criteria, 51% of BCSC radiologists had acceptable sensitivity and specificity; 40% had acceptable CDR, recall rate, and PPV1. Using the combined criteria, 69% had acceptable sensitivity and specificity and 62% had acceptable CDR, recall rate, and PPV1.
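The updated combined ranges can be read as a small set of decision rules. The sketch below encodes them as stated in this abstract; it is not the authors' code, and the function names are hypothetical. Two boundary choices are assumptions made here because the abstract does not specify them: sensitivity between 79% and 80% is placed in the 75-79% band, and a CDR falling exactly on a shared boundary (4 or 6 per 1000) is assigned to the higher band.

```python
def acceptable_sens_spec(sensitivity, specificity):
    """Set 1 check. Inputs are percentages (0-100)."""
    if sensitivity >= 80 and specificity >= 85:
        return True
    # Assumption: values in [75, 80) fall in the "75-79%" band.
    if 75 <= sensitivity < 80 and 88 <= specificity <= 97:
        return True
    return False


def acceptable_cdr_recall_ppv1(cdr, recall_rate, ppv1):
    """Set 2 check. cdr is per 1000 screens; recall_rate and ppv1 are percentages."""
    if cdr >= 6:
        # Any PPV1 is acceptable in this band.
        return 3 <= recall_rate <= 20
    # Assumption: shared boundaries (CDR of exactly 4 or 6) go to the higher band.
    if 4 <= cdr < 6:
        return 3 <= recall_rate <= 15 and ppv1 >= 3
    if 2.5 <= cdr < 4:
        return 5 <= recall_rate <= 12 and 3 <= ppv1 <= 8
    return False


if __name__ == "__main__":
    # Hypothetical readers, not data from the study.
    print(acceptable_sens_spec(77, 90))               # lower sensitivity, offset by specificity
    print(acceptable_cdr_recall_ppv1(5.0, 18.0, 4.0)) # recall rate too high for this CDR band
```

The structure makes the trade-off explicit: lower sensitivity or CDR is tolerated only when the accompanying measures fall within a tighter range.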
The combined criteria improve previous criteria by considering the interrelationships of multiple performance measures and broaden the acceptable performance ranges compared with previous criteria based on individual measures.