Goldberg Saveli I, Niemierko Andrzej, Turchin Alexander
Massachusetts General Hospital, Boston, MA, USA.
AMIA Annu Symp Proc. 2008 Nov 6;2008:242-6.
Errors in clinical research databases are common, but relatively little is known about their characteristics or about the best strategies for detecting and preventing them. We analyzed data from several clinical research databases at a single academic medical center to assess the frequency, distribution and features of data entry errors. Error rates detected by the double-entry method ranged from 2.3 to 26.9%. Errors were due both to mistakes in data entry and to misinterpretation of the information in the original documents. Error detection based on data constraint failure significantly underestimated total error rates, and constraint-based alarms integrated into the database appeared to prevent only a small fraction of errors. Many errors were non-random, organized in special and cognitive clusters, and some could potentially affect the interpretation of the study results. Further investigation is needed into methods for detecting and preventing data errors in research.
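The contrast between double-entry detection and constraint-based detection can be illustrated with a minimal Python sketch. It is not the authors' implementation: the record layout, the field names (`age`, `dose_gy`), the constraint ranges, and the sample values are all hypothetical and chosen only to show why a range check can miss an error that a second, independent entry reveals.

```python
from typing import Callable, Dict, List, Tuple

Record = Dict[str, str]
Flag = Tuple[int, str]  # (record index, field name)


def double_entry_discrepancies(first: List[Record], second: List[Record]) -> List[Flag]:
    """Flag every field where two independent entries of the same record disagree."""
    if len(first) != len(second):
        raise ValueError("both entry passes must cover the same records")
    flags: List[Flag] = []
    for i, (a, b) in enumerate(zip(first, second)):
        for field in sorted(set(a) | set(b)):
            if a.get(field) != b.get(field):
                flags.append((i, field))
    return flags


def constraint_failures(records: List[Record],
                        constraints: Dict[str, Callable[[str], bool]]) -> List[Flag]:
    """Flag every field whose value violates its validity constraint."""
    flags: List[Flag] = []
    for i, rec in enumerate(records):
        for field, is_valid in constraints.items():
            if field in rec and not is_valid(rec[field]):
                flags.append((i, field))
    return flags


def error_rate(flags: List[Flag], records: List[Record]) -> float:
    """Flagged fields as a fraction of all entered fields."""
    total = sum(len(rec) for rec in records)
    return len(flags) / total if total else 0.0


# Hypothetical sample: the second record of the first pass has a transposed
# age (45 instead of 54) and an extra digit in the dose (600 instead of 60).
first_pass = [
    {"age": "54", "dose_gy": "60"},
    {"age": "45", "dose_gy": "600"},
    {"age": "71", "dose_gy": "66"},
]
second_pass = [
    {"age": "54", "dose_gy": "60"},
    {"age": "54", "dose_gy": "60"},
    {"age": "71", "dose_gy": "66"},
]

# Hypothetical range constraints; real databases would define their own.
constraints = {
    "age": lambda v: v.isdigit() and 0 <= int(v) <= 120,
    "dose_gy": lambda v: v.isdigit() and 0 <= int(v) <= 100,
}

by_double_entry = double_entry_discrepancies(first_pass, second_pass)
by_constraints = constraint_failures(first_pass, constraints)

print("double entry:", by_double_entry, error_rate(by_double_entry, first_pass))
print("constraints: ", by_constraints, error_rate(by_constraints, first_pass))
```

In this toy data the transposed age is still a plausible value, so the range check passes it and only the comparison of the two passes flags it. A discrepancy shows only that the two entries differ; deciding which one is wrong still requires going back to the original document.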