Analysis of data errors in clinical research databases.

Authors

Goldberg Saveli I, Niemierko Andrzej, Turchin Alexander

Affiliation

Massachusetts General Hospital, Boston, MA, USA.

Publication

AMIA Annu Symp Proc. 2008 Nov 6;2008:242-6.

Abstract

Errors in clinical research databases are common, but relatively little is known about their characteristics or about optimal strategies for detecting and preventing them. We analyzed data from several clinical research databases at a single academic medical center to assess the frequency, distribution, and features of data entry errors. Error rates detected by the double-entry method ranged from 2.3% to 26.9%. Errors arose both from mistakes in data entry and from misinterpretation of information in the original documents. Error detection based on data constraint failure significantly underestimated total error rates, and constraint-based alarms integrated into the database appeared to prevent only a small fraction of errors. Many errors were non-random, organized in special and cognitive clusters, and some could potentially affect the interpretation of study results. Further investigation is needed into methods for detecting and preventing data errors in research.
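
The abstract contrasts two detection approaches: comparing two independent entries of the same records (double entry) and flagging values that fail a data constraint. The Python sketch below illustrates that contrast on made-up data. It is not the authors' procedure; the field names (`age`, `dose_gy`), the range limits, and both helper functions are hypothetical and chosen only for illustration.

```python
from typing import Any, Callable

Record = dict[str, Any]


def double_entry_mismatches(first: list[Record], second: list[Record]) -> list[tuple[int, str]]:
    """Return (row index, field) for every field where the two entries disagree."""
    if len(first) != len(second):
        raise ValueError("both entry passes must cover the same records")
    mismatches = []
    for i, (a, b) in enumerate(zip(first, second)):
        for field in a.keys() | b.keys():
            if a.get(field) != b.get(field):
                mismatches.append((i, field))
    return sorted(mismatches)


def constraint_violations(
    records: list[Record], constraints: dict[str, Callable[[Any], bool]]
) -> list[tuple[int, str]]:
    """Return (row index, field) for every value that fails its constraint."""
    flagged = []
    for i, rec in enumerate(records):
        for field, is_valid in constraints.items():
            if field in rec and not is_valid(rec[field]):
                flagged.append((i, field))
    return sorted(flagged)


# Hypothetical data: two independent entry passes over the same three records.
first_pass = [
    {"age": 54, "dose_gy": 60.0},
    {"age": 45, "dose_gy": 66.0},   # transposed digits, still a plausible age
    {"age": 710, "dose_gy": 70.0},  # extra keystroke, implausible age
]
second_pass = [
    {"age": 54, "dose_gy": 60.0},
    {"age": 54, "dose_gy": 66.0},
    {"age": 71, "dose_gy": 70.0},
]

# Hypothetical range constraints of the kind a database alarm might enforce.
constraints = {
    "age": lambda v: 0 <= v <= 120,
    "dose_gy": lambda v: 0 <= v <= 100,
}

mismatches = double_entry_mismatches(first_pass, second_pass)
flagged = constraint_violations(first_pass, constraints)
total_fields = sum(len(rec) for rec in first_pass)
mismatch_rate = len(mismatches) / total_fields

print("double-entry mismatches:", mismatches)
print("constraint violations in first pass:", flagged)
print(f"mismatch rate: {mismatch_rate:.1%}")
```

In this toy data the range check flags only the implausible value, while the field-by-field comparison also surfaces the transposed but plausible one. A mismatch shows only that the two passes disagree, not which one is correct, and an error made identically in both passes (for example, a shared misreading of the source document) would go undetected by either check.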
