Assessing the Effect of Electronic Health Record Data Quality on Identifying Patients With Type 2 Diabetes: Cross-Sectional Study.

Authors

Sood Priyanka Dua, Liu Star, Lehmann Harold, Kharrazi Hadi

Affiliations

Bloomberg School of Public Health, Johns Hopkins University, 615 N Wolfe St, Baltimore, MD, 21205, United States, 1 443-287-8264.

School of Medicine, Johns Hopkins University, Baltimore, MD, United States.

Publication information

JMIR Med Inform. 2024 Aug 27;12:e56734. doi: 10.2196/56734.

DOI:10.2196/56734
PMID:39189917
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC11370182/
Abstract

BACKGROUND

Increasing and substantial reliance on electronic health records (EHRs) and data types (ie, diagnosis, medication, and laboratory data) demands assessment of their data quality as a fundamental approach, especially since there is a need to identify appropriate denominator populations with chronic conditions, such as type 2 diabetes (T2D), using commonly available computable phenotype definitions (ie, phenotypes).

OBJECTIVE

To bridge this gap, our study aims to assess how issues of EHR data quality and variations and robustness (or lack thereof) in phenotypes may have potential impacts in identifying denominator populations.

METHODS

Approximately 208,000 patients with T2D were included in our study, which used retrospective EHR data from the Johns Hopkins Medical Institution (JHMI) during 2017-2019. Our assessment included 4 published phenotypes and 1 definition from a panel of experts at Hopkins. We conducted descriptive analyses of demographics (ie, age, sex, race, and ethnicity), use of health care (inpatient and emergency room visits), and the average Charlson Comorbidity Index score of each phenotype. We then used different methods to induce or simulate data quality issues of completeness, accuracy, and timeliness separately across each phenotype. For induced data incompleteness, our model randomly dropped diagnosis, medication, and laboratory codes independently at increments of 10%; for induced data inaccuracy, our model randomly replaced a diagnosis or medication code with another code of the same data type and induced 2% incremental change from -100% to +10% in laboratory result values; and lastly, for timeliness, data were modeled for induced incremental shift of date records by 30 days to 365 days.
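The three induced data quality perturbations described above can be sketched on toy data. This is a minimal illustration under stated assumptions, not the authors' model: the record layout `(patient_id, code, date)`, the function names, the code pool, the forward direction of the date shift, and the diagnosis-only toy phenotype are all hypothetical. `E11.9` is used only as an example ICD-10-CM type 2 diabetes code.

```python
import random
from datetime import date, timedelta

def drop_records(records, percent, rng):
    """Induced incompleteness: randomly remove `percent`% of the records."""
    n_drop = len(records) * percent // 100
    dropped = set(rng.sample(range(len(records)), n_drop))
    return [r for i, r in enumerate(records) if i not in dropped]

def replace_codes(records, percent, code_pool, rng):
    """Induced inaccuracy: swap `percent`% of codes for a different code
    of the same data type (code_pool needs at least two distinct codes)."""
    n_swap = len(records) * percent // 100
    swapped = set(rng.sample(range(len(records)), n_swap))
    out = []
    for i, (pid, code, when) in enumerate(records):
        if i in swapped:
            code = rng.choice([c for c in code_pool if c != code])
        out.append((pid, code, when))
    return out

def shift_dates(records, days):
    """Induced timeliness issue: move every record date forward by `days`
    (the direction of the shift is an assumption of this sketch)."""
    return [(pid, code, when + timedelta(days=days)) for pid, code, when in records]

def diagnosis_only_phenotype(records, t2d_codes, start, end):
    """Toy phenotype: at least one T2D diagnosis code inside the study window."""
    return {pid for pid, code, when in records
            if code in t2d_codes and start <= when <= end}

# Hypothetical toy cohort: 100 patients, one T2D diagnosis each.
rng = random.Random(0)
T2D_CODES = {"E11.9"}
CODE_POOL = ["E11.9", "I10", "J45.909"]
START, END = date(2017, 1, 1), date(2019, 12, 31)
diagnoses = [(pid, "E11.9", date(2019, 6, 30)) for pid in range(100)]

def count(records):
    return len(diagnosis_only_phenotype(records, T2D_CODES, START, END))

# Sweep incompleteness in 10% increments, as in the abstract.
incompleteness = {p: count(drop_records(diagnoses, p, rng)) for p in range(0, 101, 10)}
inaccuracy_100 = count(replace_codes(diagnoses, 100, CODE_POOL, rng))
shift_30 = count(shift_dates(diagnoses, 30))
shift_365 = count(shift_dates(diagnoses, 365))
```

In this toy setup every patient has exactly one record, so the identified population falls linearly with each 10% increment and reaches zero at 100% dropped, which mirrors the behaviour the abstract reports for a phenotype that relies only on diagnosis codes. A 365-day shift pushes every record outside the 2017-2019 window, while a 30-day shift does not.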

RESULTS

Less than a quarter (n=47,326, 23%) of the population overlapped across all phenotypes using EHRs. The population identified by each phenotype varied across all combinations of data types. Induced incompleteness identified fewer patients with each increment; for example, at 100% diagnostic incompleteness, the Chronic Conditions Data Warehouse phenotype identified zero patients, as its phenotypic characteristics included only diagnosis codes. Induced inaccuracy and timeliness similarly demonstrated variations in performance of each phenotype, therefore resulting in fewer patients being identified with each incremental change.
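As a rough check on the overlap figure, 47,326 of approximately 208,000 patients is about 23%. The overlap itself can be thought of as a set intersection across the cohorts that each phenotype identifies. The sketch below assumes the denominator is the union of all cohorts, which the abstract does not state explicitly; the phenotype names and patient IDs are hypothetical.

```python
def overlap_share(cohorts):
    """Return (number of patients identified by every phenotype,
    that number as a share of patients identified by any phenotype)."""
    sets = list(cohorts.values())
    if not sets:
        return 0, 0.0
    common = set.intersection(*sets)
    union = set.union(*sets)
    share = len(common) / len(union) if union else 0.0
    return len(common), share

# Hypothetical toy cohorts keyed by phenotype name.
cohorts = {
    "phenotype_a": {1, 2, 3, 4},
    "phenotype_b": {2, 3, 4, 5},
    "phenotype_c": {3, 4, 6},
}
n_common, share = overlap_share(cohorts)

# Rough arithmetic check of the reported figure (denominator is approximate).
reported_share = 47_326 / 208_000
```

Here patients 3 and 4 are the only ones identified by all three toy phenotypes, out of six patients identified by at least one.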

CONCLUSIONS

We used EHR data with diagnosis, medication, and laboratory data types from a large tertiary hospital system to understand T2D phenotypic differences and performance. We used induced data quality methods to learn how data quality issues may impact identification of the denominator populations upon which clinical (eg, clinical research and trials, population health evaluations) and financial or operational decisions are made. The novel results from our study may inform future approaches to shaping a common T2D computable phenotype definition that can be applied to clinical informatics, managing chronic conditions, and additional industry-wide efforts in health care.
