
Using a Data Quality Framework to Clean Data Extracted from the Electronic Health Record: A Case Study.

Authors

Dziadkowiec Oliwier, Callahan Tiffany, Ozkaynak Mustafa, Reeder Blaine, Welton John

Affiliations

University of Colorado, College of Nursing, Anschutz Medical Campus.

University of Colorado, Department of Pediatrics, Anschutz Medical Campus.

Publication information

EGEMS (Wash DC). 2016 Jun 24;4(1):1201. doi: 10.13063/2327-9214.1201. eCollection 2016.

DOI:10.13063/2327-9214.1201
PMID:27429992
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC4933574/
Abstract

OBJECTIVES

We examine the following: (1) the appropriateness of using a data quality (DQ) framework developed for relational databases as a data-cleaning tool for a data set extracted from two EPIC databases, and (2) the differences in statistical parameter estimates between a data set cleaned with the DQ framework and a data set not cleaned with the DQ framework.

BACKGROUND

The use of data contained within electronic health records (EHRs) has the potential to open doors for a new wave of innovative research. Without adequate preparation of such large data sets for analysis, the results might be erroneous, which might affect clinical decision-making or the results of comparative effectiveness research studies.

METHODS

Two emergency department (ED) data sets extracted from EPIC databases (adult ED and children's ED) were used as examples for examining the five concepts of DQ based on a DQ assessment framework designed for EHR databases. The first data set contained 70,061 visits, and the second data set contained 2,815,550 visits. SPSS Syntax examples, as well as step-by-step instructions on how to apply the five key DQ concepts to these EHR database extracts, are provided.

CONCLUSIONS

SPSS Syntax to address each of the DQ concepts proposed by Kahn et al. (2012) [1] was developed. The data set cleaned using Kahn's framework yielded more accurate results than the data set cleaned without this framework. Future plans involve creating functions in the R language for cleaning data extracted from the EHR, as well as an R package that combines DQ checks with missing data analysis functions.
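
As a rough illustration of what rule-based DQ checks on an ED visit extract can look like, here is a minimal Python sketch. It is not the authors' SPSS Syntax, and it does not claim to reproduce the five concepts in Kahn's framework. The field names (`visit_id`, `age`, `arrival`, `discharge`), the sample records, and the 0 to 120 age range are all assumptions made for this example.

```python
from collections import Counter
from datetime import datetime

FMT = "%Y-%m-%d %H:%M"  # assumed timestamp format for this example


def flag_visits(visits, max_age=120):
    """Return one list of data quality flags per record, in input order."""
    id_counts = Counter(v["visit_id"] for v in visits)
    results = []
    for v in visits:
        flags = []
        # Domain check: the value must fall inside a plausible range.
        if v["age"] is None or not (0 <= v["age"] <= max_age):
            flags.append("age_out_of_range")
        # Integrity check: the visit identifier should be unique.
        if id_counts[v["visit_id"]] > 1:
            flags.append("duplicate_visit_id")
        # Temporal check: discharge cannot precede arrival.
        arrival = datetime.strptime(v["arrival"], FMT)
        discharge = datetime.strptime(v["discharge"], FMT)
        if discharge < arrival:
            flags.append("discharge_before_arrival")
        results.append(flags)
    return results


# Hypothetical records, invented purely for illustration.
visits = [
    {"visit_id": 1, "age": 34, "arrival": "2014-01-05 10:00", "discharge": "2014-01-05 14:30"},
    {"visit_id": 2, "age": 212, "arrival": "2014-01-06 09:00", "discharge": "2014-01-06 11:00"},
    {"visit_id": 3, "age": 8, "arrival": "2014-01-07 18:00", "discharge": "2014-01-07 16:00"},
    {"visit_id": 3, "age": 51, "arrival": "2014-01-08 08:00", "discharge": "2014-01-08 09:15"},
]

flags = flag_visits(visits)
clean = [v for v, f in zip(visits, flags) if not f]
for v, f in zip(visits, flags):
    print(v["visit_id"], f)
```

Flagging records rather than deleting them keeps the original extract intact, so estimates from the full and the cleaned data can be compared afterwards, in the spirit of the comparison described in the abstract.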

Figures

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/45f6/4933574/2b3b6bfe1696/egems1201f1.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/45f6/4933574/eba342b097ef/egems1201f2.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/45f6/4933574/61331f3e0ee2/egems1201f3.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/45f6/4933574/143a816841b4/egems1201f4.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/45f6/4933574/d14639f5a9fa/egems1201f5.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/45f6/4933574/50a6f7266858/egems1201f6.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/45f6/4933574/25e834a23468/egems1201f7.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/45f6/4933574/3baaca1854dd/egems1201f8.jpg

Similar articles

1
Using a Data Quality Framework to Clean Data Extracted from the Electronic Health Record: A Case Study.
EGEMS (Wash DC). 2016 Jun 24;4(1):1201. doi: 10.13063/2327-9214.1201. eCollection 2016.
2
A Harmonized Data Quality Assessment Terminology and Framework for the Secondary Use of Electronic Health Record Data.
EGEMS (Wash DC). 2016 Sep 11;4(1):1244. doi: 10.13063/2327-9214.1244. eCollection 2016.
3
Initializing a hospital-wide data quality program. The AP-HP experience.
Comput Methods Programs Biomed. 2019 Nov;181:104804. doi: 10.1016/j.cmpb.2018.10.016. Epub 2018 Nov 9.
4
Linking a Consortium-Wide Data Quality Assessment Tool with the MIRACUM Metadata Repository.
Appl Clin Inform. 2021 Aug;12(4):826-835. doi: 10.1055/s-0041-1733847. Epub 2021 Aug 25.
5
Moving Towards an EHR Data Quality Framework: The MIRACUM Approach.
Stud Health Technol Inform. 2019 Sep 3;267:247-253. doi: 10.3233/SHTI190834.
6
A method for cohort selection of cardiovascular disease records from an electronic health record system.
Int J Med Inform. 2017 Jun;102:138-149. doi: 10.1016/j.ijmedinf.2017.03.015. Epub 2017 Mar 30.
7
DQAgui: a graphical user interface for the MIRACUM data quality assessment tool.
BMC Med Inform Decis Mak. 2022 Aug 11;22(1):213. doi: 10.1186/s12911-022-01961-z.
8
Assessing the practice of data quality evaluation in a national clinical data research network through a systematic scoping review in the era of real-world data.
J Am Med Inform Assoc. 2020 Dec 9;27(12):1999-2010. doi: 10.1093/jamia/ocaa245.
9
Developing a systematic approach to assessing data quality in secondary use of clinical data based on intended use.
Learn Health Syst. 2021 May 3;6(1):e10264. doi: 10.1002/lrh2.10264. eCollection 2022 Jan.
10
DQ-v: A Database-Agnostic Framework for Exploring Variability in Electronic Health Record Data Across Time and Site Location.
EGEMS (Wash DC). 2017 May 10;5(1):3. doi: 10.13063/2327-9214.1277.

Cited by

1
A Standardized Guideline for Assessing Extracted Electronic Health Records Cohorts: A Scoping Review.
AMIA Jt Summits Transl Sci Proc. 2025 Jun 10;2025:527-536. eCollection 2025.
2
Data Quality-Driven Improvement in Health Care: Systematic Literature Review.
J Med Internet Res. 2024 Aug 22;26:e57615. doi: 10.2196/57615.
3
A cohort study of engagement in telehealth psychotherapy versus in-person services.
Psychother Res. 2024 Jul 21:1-11. doi: 10.1080/10503307.2024.2375231.
4
Systematically assessing the quality of dental electronic health record data for an investigation into oral health care disparities.
J Public Health Dent. 2024 Sep;84(3):242-250. doi: 10.1111/jphd.12618. Epub 2024 Apr 24.
5
Electronic health record data quality assessment and tools: a systematic review.
J Am Med Inform Assoc. 2023 Sep 25;30(10):1730-1740. doi: 10.1093/jamia/ocad120.
6
Research data warehouse: using electronic health records to conduct population-based observational studies.
JAMIA Open. 2023 Jun 21;6(2):ooad039. doi: 10.1093/jamiaopen/ooad039. eCollection 2023 Jul.
7
BAYESIAN ANALYSIS FOR IMBALANCED POSITIVE-UNLABELLED DIAGNOSIS CODES IN ELECTRONIC HEALTH RECORDS.
Ann Appl Stat. 2023 Jun;17(2):1220-1238. doi: 10.1214/22-AOAS1666. Epub 2023 May 1.
8
Automating Electronic Health Record Data Quality Assessment.
J Med Syst. 2023 Feb 13;47(1):23. doi: 10.1007/s10916-022-01892-2.
9
Data report on three datasets: Mortality patterns between agricultural and non-agricultural ward areas.
Front Genet. 2023 Jan 4;13:953167. doi: 10.3389/fgene.2022.953167. eCollection 2022.
10
An automated data cleaning method for Electronic Health Records by incorporating clinical knowledge.
BMC Med Inform Decis Mak. 2021 Sep 17;21(1):267. doi: 10.1186/s12911-021-01630-7.

References

1
Transparent reporting of data quality in distributed data networks.
EGEMS (Wash DC). 2015 Mar 23;3(1):1052. doi: 10.13063/2327-9214.1052. eCollection 2015.
2
Availability of structured and unstructured clinical data for comparative effectiveness research and quality improvement: a multisite assessment.
EGEMS (Wash DC). 2014 Jul 11;2(1):1079. doi: 10.13063/2327-9214.1079. eCollection 2014.
3
Preparing Electronic Clinical Data for Quality Improvement and Comparative Effectiveness Research: The SCOAP CERTAIN Automation and Validation Project.
EGEMS (Wash DC). 2013 Sep 10;1(1):1025. doi: 10.13063/2327-9214.1025. eCollection 2013.
4
In Search of a Data-in-Once, Electronic Health Record-Linked, Multicenter Registry-How Far We Have Come and How Far We Still Have to Go.
EGEMS (Wash DC). 2013 Jan 17;1(1):1003. doi: 10.13063/2327-9214.1003. eCollection 2013.
5
Protocol for Reducing Time to Antibiotics in Pediatric Patients Presenting to an Emergency Department With Fever and Neutropenia: Efficacy and Barriers.
Pediatr Emerg Care. 2016 Nov;32(11):739-745. doi: 10.1097/PEC.0000000000000362.
6
How the provenance of electronic health record data matters for research: a case example using system mapping.
EGEMS (Wash DC). 2014 Apr 16;2(1):1058. doi: 10.13063/2327-9214.1058. eCollection 2014.
7
Business intelligence and nursing administration.
J Nurs Adm. 2014 May;44(5):245-6. doi: 10.1097/NNA.0000000000000060.
8
Electronic health information quality challenges and interventions to improve public health surveillance data and practice.
Public Health Rep. 2013 Nov-Dec;128(6):546-53. doi: 10.1177/003335491312800614.
9
Caveats for the use of operational electronic health record data in comparative effectiveness research.
Med Care. 2013 Aug;51(8 Suppl 3):S30-7. doi: 10.1097/MLR.0b013e31829b1dbd.
10
Challenges in using electronic health record data for CER: experience of 4 learning organizations and solutions applied.
Med Care. 2013 Aug;51(8 Suppl 3):S80-6. doi: 10.1097/MLR.0b013e31829b1d48.