Departments of Family Medicine and Public Health and Preventive Medicine, Oregon Health and Science University, 3181 SW Sam Jackson Park Rd, Portland, OR 97239-3098, USA.
Radiology. 2013 May;267(2):359-67. doi: 10.1148/radiol.12121216. Epub 2013 Jan 7.
To develop criteria to identify thresholds for the minimally acceptable performance of physicians interpreting diagnostic mammography studies.
In an institutional review board-approved, HIPAA-compliant study, an Angoff approach was used to set criteria for identifying minimally acceptable interpretive performance for both workup after abnormal screening examinations and workup of a breast lump. Normative data from the Breast Cancer Surveillance Consortium (BCSC) were used to help the expert radiologist identify the impact of cut points. Simulations, also using data from the BCSC, were used to estimate the expected clinical impact of the recommended performance thresholds.
Final cut points for workup of abnormal screening examinations were as follows: sensitivity, less than 80%; specificity, less than 80% or greater than 95%; abnormal interpretation rate, less than 8% or greater than 25%; positive predictive value (PPV) of biopsy recommendation (PPV2), less than 15% or greater than 40%; PPV of biopsy performed (PPV3), less than 20% or greater than 45%; and cancer diagnosis rate, less than 20 per 1000 interpretations. Final cut points for workup of a breast lump were as follows: sensitivity, less than 85%; specificity, less than 83% or greater than 95%; abnormal interpretation rate, less than 10% or greater than 25%; PPV2, less than 25% or greater than 50%; PPV3, less than 30% or greater than 55%; and cancer diagnosis rate, less than 40 per 1000 interpretations. If underperforming physicians moved into the acceptable range after remedial training, the expected result would be (a) diagnosis of an additional 86 cancers per 100,000 women undergoing workup after screening examinations, with a reduction in the number of false-positive examinations by 1067 per 100,000 women undergoing this workup, and (b) diagnosis of an additional 335 cancers per 100,000 women undergoing workup of a breast lump, with a reduction in the number of false-positive examinations by 634 per 100,000 women undergoing this workup.
Interpreting physicians who fall outside one or more of the identified cut points should be reviewed in the context of an overall assessment of all their performance measures and their specific practice setting to determine if remedial training is indicated.
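As an illustration only, the reported cut points can be encoded as a lookup table and checked against one physician's measures. The following Python sketch is not part of the study: the names `CutPoint`, `SCREENING_WORKUP`, `BREAST_LUMP_WORKUP`, and `flag_measures` are hypothetical, and the example physician values are invented for demonstration. A flag here only marks a measure for review; as stated above, the decision about remedial training depends on an overall assessment.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass(frozen=True)
class CutPoint:
    """Bounds of the acceptable range; values outside are flagged."""
    low: Optional[float] = None   # flag if value < low
    high: Optional[float] = None  # flag if value > high


# Cut points as reported in the abstract. All values are percentages,
# except the cancer diagnosis rate, which is per 1000 interpretations.
SCREENING_WORKUP: Dict[str, CutPoint] = {
    "sensitivity": CutPoint(low=80),
    "specificity": CutPoint(low=80, high=95),
    "abnormal_interpretation_rate": CutPoint(low=8, high=25),
    "ppv2": CutPoint(low=15, high=40),
    "ppv3": CutPoint(low=20, high=45),
    "cancer_diagnosis_rate_per_1000": CutPoint(low=20),
}

BREAST_LUMP_WORKUP: Dict[str, CutPoint] = {
    "sensitivity": CutPoint(low=85),
    "specificity": CutPoint(low=83, high=95),
    "abnormal_interpretation_rate": CutPoint(low=10, high=25),
    "ppv2": CutPoint(low=25, high=50),
    "ppv3": CutPoint(low=30, high=55),
    "cancer_diagnosis_rate_per_1000": CutPoint(low=40),
}


def flag_measures(measures: Dict[str, float],
                  cut_points: Dict[str, CutPoint]) -> List[str]:
    """Return the names of measures that fall outside their cut points."""
    flagged = []
    for name, value in measures.items():
        cp = cut_points[name]
        below = cp.low is not None and value < cp.low
        above = cp.high is not None and value > cp.high
        if below or above:
            flagged.append(name)
    return flagged


# Hypothetical physician, invented for illustration (not study data).
example = {
    "sensitivity": 78.0,
    "specificity": 90.0,
    "abnormal_interpretation_rate": 27.0,
    "ppv2": 30.0,
    "ppv3": 35.0,
    "cancer_diagnosis_rate_per_1000": 25.0,
}

print(flag_measures(example, SCREENING_WORKUP))
print(flag_measures(example, BREAST_LUMP_WORKUP))
```

The same measures are checked against both tables because the two indications have different cut points; a cancer diagnosis rate of 25 per 1000 is acceptable for workup after abnormal screening but falls below the threshold for workup of a breast lump.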