Hanskamp-Sebregts Mirelle, Zegers Marieke, Vincent Charles, van Gurp Petra J, de Vet Henrica C W, Wollersheim Hub
Radboud University Medical Center, Institute of Quality Assurance and Patient Safety, Nijmegen, The Netherlands.
Radboud University Medical Center, Radboud Institute for Health Sciences, IQ healthcare, Nijmegen, The Netherlands.
BMJ Open. 2016 Aug 22;6(8):e011078. doi: 10.1136/bmjopen-2016-011078.
Record review is the most widely used method to quantify patient safety. We systematically reviewed the reliability and validity of adverse event detection with record review.
A systematic review of the literature.
We searched PubMed, EMBASE, CINAHL, PsycINFO and the Cochrane Library from their inception through February 2015. We included all studies that aimed to describe the reliability and/or validity of record review. Two reviewers conducted data extraction. We pooled κ values and analysed the differences between subgroups according to number of reviewers, reviewer experience and training level, adjusted for the prevalence of adverse events.
Twenty-five studies reported psychometric data for the Global Trigger Tool (GTT) and the Harvard Medical Practice Study (HMPS) method, and 24 of them were included in the statistical pooling. The inter-rater reliability of the GTT and HMPS showed a pooled κ of 0.65 and 0.55, respectively. The inter-rater agreement was statistically significantly higher when the group of reviewers within a study consisted of a maximum of five reviewers. We found no studies reporting on the validity of the GTT and HMPS.
The reliability of record review is moderate to substantial and improved when a small group of reviewers carried out record review. The validity of the record review method has never been evaluated, while clinical data registries, autopsy or direct observations of patient care are potential reference methods that can be used to test concurrent validity.
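As an illustration of the κ statistic reported above, the following minimal sketch computes Cohen's κ for two reviewers who each judge whether a record contains an adverse event, and maps a κ value to the commonly used Landis and Koch descriptive bands. The function names and the counts in the usage note are hypothetical and are not taken from the reviewed studies; the abstract does not state how κ values were pooled, so no pooling step is shown.

```python
def cohen_kappa(both_yes, only_a, only_b, both_no):
    """Cohen's kappa for two reviewers and a yes/no adverse-event judgement.

    Arguments are the four cell counts of the 2x2 agreement table
    (hypothetical inputs, not data from the review).
    """
    n = both_yes + only_a + only_b + both_no
    if n == 0:
        raise ValueError("agreement table is empty")

    # Observed agreement: share of records where both reviewers agree.
    p_observed = (both_yes + both_no) / n

    # Marginal share of "adverse event" judgements for each reviewer.
    a_yes = (both_yes + only_a) / n
    b_yes = (both_yes + only_b) / n

    # Agreement expected by chance, given those marginals.
    p_expected = a_yes * b_yes + (1 - a_yes) * (1 - b_yes)
    if p_expected == 1:
        raise ValueError("kappa is undefined when chance agreement is 1")

    return (p_observed - p_expected) / (1 - p_expected)


def landis_koch_label(kappa):
    """Descriptive band for a kappa value (Landis and Koch convention)."""
    if kappa < 0:
        return "poor"
    if kappa <= 0.20:
        return "slight"
    if kappa <= 0.40:
        return "fair"
    if kappa <= 0.60:
        return "moderate"
    if kappa <= 0.80:
        return "substantial"
    return "almost perfect"
```

For example, with hypothetical counts of 20 records flagged by both reviewers, 5 by the first only, 10 by the second only and 65 by neither, `cohen_kappa(20, 5, 10, 65)` gives observed agreement 0.85 against chance agreement 0.60, so κ is about 0.625. Under the same convention, the pooled values of 0.65 (GTT) and 0.55 (HMPS) fall in the "substantial" and "moderate" bands, respectively. Because chance agreement depends on the marginals, κ is sensitive to the prevalence of adverse events, which is consistent with the review adjusting for prevalence.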