Timmers J M H, Verbeek A L M, Pijnappel R M, Broeders M J M, den Heeten G J
National Expert and Training Centre for Breast Cancer Screening, PO Box 6873, 6503 GJ, Nijmegen, The Netherlands.
Eur Radiol. 2014 Feb;24(2):294-304. doi: 10.1007/s00330-013-3018-4. Epub 2013 Sep 22.
To evaluate a self-test for Dutch breast screening radiologists introduced as part of the national quality assurance programme.
A total of 144 radiologists were invited to complete a test-set of 60 screening mammograms (20 malignancies). For each finding, participants recorded details such as location, lesion type and BI-RADS category. We determined areas under the receiver operating characteristic (ROC) curves (AUC), case and lesion sensitivity and specificity, agreement (kappa) and correlation between reader characteristics and case sensitivity (Spearman correlation coefficients).
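As a rough illustration of the per-reader measures named above, the sketch below computes case sensitivity, case specificity and an empirical AUC from one reader's scores. The toy data, the 1-5 suspicion score, the recall threshold of 3 and the function names are all assumptions made for the example; the abstract does not describe how the study's calculations were implemented.

```python
from itertools import product


def sensitivity_specificity(truth, recalled):
    """Case sensitivity and specificity from binary truth and recall decisions."""
    tp = sum(1 for t, r in zip(truth, recalled) if t and r)
    fn = sum(1 for t, r in zip(truth, recalled) if t and not r)
    tn = sum(1 for t, r in zip(truth, recalled) if not t and not r)
    fp = sum(1 for t, r in zip(truth, recalled) if not t and r)
    return tp / (tp + fn), tn / (tn + fp)


def empirical_auc(truth, scores):
    """Empirical AUC: the share of (malignant, normal) case pairs in which the
    malignant case has the higher score, counting ties as one half."""
    pos = [s for t, s in zip(truth, scores) if t]
    neg = [s for t, s in zip(truth, scores) if not t]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p, n in product(pos, neg))
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: 8 cases, 3 of them malignant (1 = malignant).
truth = [1, 1, 1, 0, 0, 0, 0, 0]
# Hypothetical suspicion scores from one reader (1 = normal, 5 = highly suspicious).
scores = [5, 4, 1, 1, 2, 1, 4, 1]
# Assumed recall rule for this example only: recall when the score is 3 or higher.
recalled = [s >= 3 for s in scores]

sens, spec = sensitivity_specificity(truth, recalled)
auc = empirical_auc(truth, scores)
print(f"sensitivity={sens:.2f} specificity={spec:.2f} AUC={auc:.2f}")
```

The pairwise AUC used here is equivalent to the area under the empirical ROC curve and was chosen because it needs no external library; with a real 60-case test-set a statistics package would normally be used instead.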
A total of 110 radiologists completed the test (76%). Participants read a median of 10,000 screening mammograms per year. The median AUC was 0.93, case sensitivity and lesion sensitivity were both 91%, and case specificity was 94%. We found substantial agreement for recall (κ = 0.77) and laterality (κ = 0.80), moderate agreement for lesion type (κ = 0.57) and BI-RADS (κ = 0.45), and no correlation between case sensitivity and reader characteristics.
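The agreement labels reported above match the commonly used Landis and Koch bands (0.41-0.60 moderate, 0.61-0.80 substantial), although the abstract does not name the scale or the kappa variant used. The sketch below therefore makes two assumptions: it uses Cohen's kappa for a single hypothetical pair of readers, and it applies the Landis and Koch bands to label the result. The reader data and function names are invented for illustration.

```python
from collections import Counter


def cohen_kappa(a, b):
    """Cohen's kappa for two readers rating the same cases."""
    n = len(a)
    observed = sum(1 for x, y in zip(a, b) if x == y) / n
    count_a, count_b = Counter(a), Counter(b)
    # Chance agreement: product of the two readers' marginal proportions per category.
    expected = sum(count_a[k] * count_b[k] for k in set(a) | set(b)) / n ** 2
    if expected == 1:
        return 1.0  # both readers used one identical category throughout
    return (observed - expected) / (1 - expected)


def landis_koch(kappa):
    """Verbal label for a kappa value using the Landis and Koch bands."""
    if kappa < 0:
        return "poor"
    if kappa <= 0.20:
        return "slight"
    if kappa <= 0.40:
        return "fair"
    if kappa <= 0.60:
        return "moderate"
    if kappa <= 0.80:
        return "substantial"
    return "almost perfect"


# Hypothetical recall decisions (1 = recall) by two readers on 10 cases.
reader_a = [1, 1, 0, 0, 1, 0, 0, 0, 1, 0]
reader_b = [1, 0, 0, 0, 1, 0, 0, 1, 1, 0]

kappa = cohen_kappa(reader_a, reader_b)
print(f"kappa={kappa:.2f} ({landis_koch(kappa)})")

# The values reported in the abstract, labelled with the same bands.
for name, value in [("recall", 0.77), ("laterality", 0.80),
                    ("lesion type", 0.57), ("BI-RADS", 0.45)]:
    print(name, value, landis_koch(value))
```

With 110 readers, a multi-rater statistic or an average over reader pairs would be needed; the two-reader form is shown only because it makes the observed-versus-chance correction easy to follow.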
Areas under the ROC curve, case sensitivity and lesion sensitivity were satisfactory and recall agreement was substantial. However, agreement in lesion type and BI-RADS could be improved; further education might be aimed at reducing interobserver variation in interpretation and description of abnormalities. We offered individual feedback on interpretive performance and overall feedback at group level. Future research will determine whether performance has improved.
• We introduced and evaluated a self-test for Dutch breast screening radiologists.
• ROC curves, case and lesion sensitivity and recall agreement were all satisfactory.
• Agreement in BI-RADS interpretation and description of abnormalities could be improved.
• These are areas that should be targeted with further education and training.
• We offered individual feedback on interpretative performance and overall group feedback.