Yawn Barbara P, Wollan Peter
Department of Research, Olmsted Medical Center, Rochester, MN 55904, USA.
Am J Epidemiol. 2005 May 15;161(10):974-7. doi: 10.1093/aje/kwi122.
In medical records review studies, information on the interrater reliability (IRR) of the data is seldom reported. This study assesses the IRR of data collected for a complex medical records review study. Elements selected for determining IRR included "demographic" data that require copying explicit information (e.g., gender, birth date), "free-text" data that require identifying and copying (e.g., chief complaints and diagnoses), and data that require abstractor judgment in determining what to record (e.g., whether heart disease was considered). Rates of agreement were assessed by the greatest number of answers (one to all n) that were the same. The IRR scores improved over time. At 1 month, the reliability for demographic data elements was very good, for free-text data elements was good, but for data elements requiring abstractor judgment was unacceptable (only 3.4 of six answers agreed, on average). All assessments after 6 months showed very good to excellent IRR. This study demonstrates that IRR can be evaluated and summarized, providing important information to the study investigators and to the consumer for assessing the reliability of the data and therefore the validity of the study results and conclusions. IRR information should be required for all large medical records studies.
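The agreement measure described above (the greatest number of identical answers, from one to all n) can be illustrated with a minimal sketch. This is not the authors' code. The function names, the exact-match comparison of answers, and the sample data are all assumptions made for illustration; the abstract does not say how answers were compared or how the scores were computed.

```python
from collections import Counter
from statistics import mean


def max_agreement(answers):
    """Size of the largest group of identical answers for one data element.

    Assumption: two answers agree only if they are exactly equal.
    """
    if not answers:
        raise ValueError("at least one answer is required")
    # most_common(1) returns [(value, count)] for the most frequent answer.
    return Counter(answers).most_common(1)[0][1]


def mean_max_agreement(items):
    """Average of max_agreement over several data elements."""
    return mean(max_agreement(answers) for answers in items)


# Hypothetical data: three judgment elements, each recorded by six abstractors.
items = [
    ["yes", "yes", "yes", "no", "no", "unknown"],  # largest group: 3 of 6
    ["yes", "yes", "yes", "yes", "no", "no"],      # largest group: 4 of 6
    ["no", "no", "no", "no", "no", "no"],          # largest group: 6 of 6
]

print(mean_max_agreement(items))
```

Under this reading, a reported average such as 3.4 of six means that, per element, the largest group of matching answers contained 3.4 abstractors on average.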