Measuring the Impact of Data Quality and Computable Phenotypes on Potential Racial Disparities in Predicting Healthcare Utilization Among Type 2 Diabetes Populations.

Authors

Sood Priyanka D, Liu Star, Pandya Chintan, Kalyani Rita R, Lehmann Harold P, Kharrazi Hadi

Affiliations

Johns Hopkins Bloomberg School of Public Health, Baltimore, MD, USA.

Johns Hopkins School of Medicine, Baltimore, MD, USA.

Publication information

J Racial Ethn Health Disparities. 2025 May 27. doi: 10.1007/s40615-025-02485-8.

DOI: 10.1007/s40615-025-02485-8
PMID: 40425977

Abstract

INTRODUCTION

Type 2 diabetes (T2D) computable phenotypes identify different denominator populations for downstream tasks. Differences in racial composition could introduce bias and lead to disparate disease management. The objective of this study was to assess potential racial disparities in predicting T2D healthcare utilization introduced by data quality and computable phenotypes.

METHODS

Four published T2D phenotypes and one local T2D phenotype were applied to the EHR and claims datasets of a large academic medical center. Population characteristics were compared across phenotypes, stratified by race. We induced data incompleteness, inaccuracy, and untimeliness to measure their impact on the racial composition of the denominator population. We trained logistic classification models separately on each phenotype-specific population and compared disparities in utilization prediction (i.e., inpatient (IP) and emergency room (ER) admissions). Model performance metrics, such as mean AUC and positive/negative predictive values, were compared across phenotypes, stratified by race.
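The stratified comparison described above can be sketched as follows. This is a minimal illustration in plain Python, not the authors' pipeline: the record fields (`phenotype`, `race`, `admitted`, `score`), the 0.5 decision threshold, and the toy records are all assumptions invented for the example. The `score` values stand in for predicted admission probabilities from a fitted logistic model.

```python
from collections import defaultdict

def auc(labels, scores):
    """Rank-based AUC: chance that a random positive outscores a random negative (ties count 0.5)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        return None  # AUC is undefined when one class is absent
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def ppv_npv(labels, scores, threshold=0.5):
    """Positive and negative predictive values at an assumed decision threshold."""
    tp = fp = tn = fn = 0
    for y, s in zip(labels, scores):
        predicted = s >= threshold
        if predicted and y == 1:
            tp += 1
        elif predicted:
            fp += 1
        elif y == 0:
            tn += 1
        else:
            fn += 1
    ppv = tp / (tp + fp) if (tp + fp) else None
    npv = tn / (tn + fn) if (tn + fn) else None
    return ppv, npv

def stratified_metrics(records, threshold=0.5):
    """Group records by (phenotype, race) and compute metrics within each stratum."""
    groups = defaultdict(list)
    for r in records:
        groups[(r["phenotype"], r["race"])].append(r)
    out = {}
    for key, rows in groups.items():
        labels = [r["admitted"] for r in rows]
        scores = [r["score"] for r in rows]
        ppv, npv = ppv_npv(labels, scores, threshold)
        out[key] = {"n": len(rows), "auc": auc(labels, scores), "ppv": ppv, "npv": npv}
    return out

# Toy records (made up): one hypothetical phenotype "A", two race strata.
records = [
    {"phenotype": "A", "race": "Black", "admitted": 1, "score": 0.9},
    {"phenotype": "A", "race": "Black", "admitted": 0, "score": 0.2},
    {"phenotype": "A", "race": "Black", "admitted": 1, "score": 0.6},
    {"phenotype": "A", "race": "Black", "admitted": 0, "score": 0.7},
    {"phenotype": "A", "race": "White", "admitted": 1, "score": 0.8},
    {"phenotype": "A", "race": "White", "admitted": 0, "score": 0.3},
    {"phenotype": "A", "race": "White", "admitted": 0, "score": 0.4},
    {"phenotype": "A", "race": "White", "admitted": 1, "score": 0.45},
]
metrics = stratified_metrics(records)
```

Computing each metric within a stratum, rather than once over the pooled population, is what allows performance gaps between race groups under the same phenotype to surface.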

RESULTS

Different T2D computable phenotypes identified populations with modestly different racial compositions. Black T2D patients had higher average ER admissions than other racial groups. Induced data quality challenges diminished patient counts across all racial groups proportionally. Charlson comorbidity score had the highest odds ratio in predicting IP and ER admissions across phenotypes and race groups. Specific T2D phenotypes showed the highest and lowest mean AUCs in predicting IP and ER admissions in Black and White populations; however, such results were not observed among Asian/Other populations.
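To make the data quality experiment concrete, the sketch below shows one way to induce incompleteness and check whether the racial composition of a phenotype-defined cohort shifts. The abstract does not specify the phenotype rules or the induction procedure, so everything here is an assumption for illustration: the rule of at least two diagnosis records, the random-drop mechanism, the field names, and the toy patients.

```python
import random
from collections import Counter

def phenotype_cohort(patients, min_dx=2):
    """Hypothetical phenotype rule: keep patients with at least `min_dx` T2D diagnosis records."""
    return [p for p in patients if len(p["dx_dates"]) >= min_dx]

def induce_incompleteness(patients, drop_rate, seed=0):
    """Return a copy of the data with each diagnosis record dropped with probability `drop_rate`."""
    rng = random.Random(seed)  # seeded for reproducibility
    degraded = []
    for p in patients:
        kept = [d for d in p["dx_dates"] if rng.random() >= drop_rate]
        degraded.append({**p, "dx_dates": kept})
    return degraded

def racial_composition(cohort):
    """Share of each race group in the cohort (empty dict for an empty cohort)."""
    counts = Counter(p["race"] for p in cohort)
    total = sum(counts.values())
    return {race: n / total for race, n in counts.items()} if total else {}

# Toy patients (made up for illustration only).
patients = [
    {"id": 1, "race": "Black", "dx_dates": ["2020-01-05", "2020-06-11"]},
    {"id": 2, "race": "White", "dx_dates": ["2020-02-01", "2020-03-15", "2021-01-09"]},
    {"id": 3, "race": "Asian/Other", "dx_dates": ["2020-04-20"]},
    {"id": 4, "race": "White", "dx_dates": ["2019-11-02", "2020-08-30"]},
]

baseline = racial_composition(phenotype_cohort(patients))
degraded = racial_composition(phenotype_cohort(induce_incompleteness(patients, drop_rate=0.3)))
```

Comparing `baseline` with `degraded` across a range of drop rates indicates whether cohort shrinkage is proportional across groups or concentrated in one of them.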

CONCLUSION

Utilization prediction differed among phenotypes and race groups. Understanding the complexities behind phenotypes, data quality, and predictive models could help mitigate health disparities further downstream and inform clinical research and disease management.
