Medical Image Optimisation and Perception Group (MIOPeG), Faculty of Health Sciences, University of Sydney, Lidcombe, NSW, Australia.
Clin Radiol. 2012 Jul;67(7):623-8. doi: 10.1016/j.crad.2012.02.007. Epub 2012 Apr 7.
The purpose of this article is to review the limitations of current methods of assessing reader accuracy in mammography screening programmes. Clinical audit is commonly used as a quality-assurance tool to monitor the performance of screen readers; however, a number of the metrics employed, such as recall rate as a surrogate for specificity, do not always accurately measure the intended clinical feature. Alternatively, standardized screening test sets, which benefit from ease of application, immediacy of results, and quicker assessment of quality improvement plans, suffer from experimental confounders, which calls into question the relevance of these laboratory-type test sets to clinical performance. Four key factors that affect the external validity of screening test sets were identified: the nature and extent of scrutiny of one's actions, the artificiality of the environment, the over-simplification of responses, and the prevalence of abnormality. The impact of these factors in radiological and other contexts is discussed. Although it is important to acknowledge the benefits of standardized screening test sets, issues relating to their relevance to clinical activities remain. The degree of correlation between performance in real-life clinical audit and performance on screen-reading test sets must be better understood, and the specific causes of any lack of correlation identified.