Measuring the Impact of Data Quality and Computable Phenotypes on Potential Racial Disparities in Predicting Healthcare Utilization Among Type 2 Diabetes Populations.

Authors

Sood Priyanka D, Liu Star, Pandya Chintan, Kalyani Rita R, Lehmann Harold P, Kharrazi Hadi

Affiliations

Johns Hopkins Bloomberg School of Public Health, Baltimore, MD, USA.

Johns Hopkins School of Medicine, Baltimore, MD, USA.

Publication

J Racial Ethn Health Disparities. 2025 May 27. doi: 10.1007/s40615-025-02485-8.

Abstract

INTRODUCTION

Type 2 diabetes (T2D) computable phenotypes identify different denominator populations for downstream tasks. Differences in racial composition could introduce bias and lead to disparate disease management. The objective of this study was to assess potential racial disparities in predicting T2D healthcare utilization introduced by data quality and computable phenotypes.

METHODS

Four published T2D phenotypes and one local T2D phenotype were applied to the EHR and claims datasets of a large academic medical center. Population characteristics were compared across phenotypes, stratified by race. We induced data incompleteness, inaccuracy, and untimeliness to measure the impact on denominator racial composition. We trained logistic classification models on each of the phenotype-specific populations separately and compared disparities in utilization prediction (i.e., inpatient (IP) and emergency room (ER) admissions). Model performance metrics, such as mean AUC and positive/negative predictive values, were compared across phenotypes, stratified by race.
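The stratified comparison described above can be illustrated with a minimal sketch. It is not the authors' code: the abstract does not give the model specification, the decision threshold, or the data layout, so the record format `(group, label, score)`, the 0.5 threshold, the function names, and the toy values below are all assumptions made for illustration. The sketch computes AUC and positive/negative predictive values separately for each stratum from predicted probabilities that are assumed to have been produced already.

```python
from collections import defaultdict


def auc(labels, scores):
    """AUC as the probability that a random positive outranks a random
    negative; ties count as 0.5. Returns None if a class is absent."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        return None
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def ppv_npv(labels, scores, threshold=0.5):
    """Positive and negative predictive values at an assumed threshold."""
    tp = fp = tn = fn = 0
    for y, s in zip(labels, scores):
        predicted = 1 if s >= threshold else 0
        if predicted == 1 and y == 1:
            tp += 1
        elif predicted == 1 and y == 0:
            fp += 1
        elif predicted == 0 and y == 0:
            tn += 1
        else:
            fn += 1
    ppv = tp / (tp + fp) if (tp + fp) else None
    npv = tn / (tn + fn) if (tn + fn) else None
    return ppv, npv


def metrics_by_group(records, threshold=0.5):
    """records: iterable of (group, observed_label, predicted_probability).
    Returns one metrics dict per group (stratum)."""
    grouped = defaultdict(lambda: ([], []))
    for group, label, score in records:
        grouped[group][0].append(label)
        grouped[group][1].append(score)
    out = {}
    for group, (labels, scores) in grouped.items():
        ppv, npv = ppv_npv(labels, scores, threshold)
        out[group] = {"n": len(labels), "auc": auc(labels, scores),
                      "ppv": ppv, "npv": npv}
    return out


# Hypothetical toy data, not study data: two strata "A" and "B".
toy = [
    ("A", 1, 0.9), ("A", 0, 0.2), ("A", 1, 0.6), ("A", 0, 0.7),
    ("B", 1, 0.8), ("B", 0, 0.1), ("B", 0, 0.3), ("B", 1, 0.4),
]
results = metrics_by_group(toy)
for group, m in sorted(results.items()):
    print(group, m)
```

Running the same function once per phenotype-specific population would give a phenotype-by-race grid of metrics, which is the kind of comparison the methods describe. Very small strata produce unstable estimates, so in practice each cell would also need a confidence interval or a resampled mean.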

RESULTS

Different T2D computable phenotypes identified populations with modestly different racial compositions. Black T2D patients had the highest average number of ER admissions of any racial group. Induced data quality challenges diminished patient counts across all racial groups proportionally. Charlson comorbidity score had the highest odds ratio in predicting IP and ER admissions across phenotypes and race groups. Specific T2D phenotypes showed the highest and lowest mean AUCs in predicting IP and ER admissions in Black and White populations; however, such results were not observed among Asian/Other populations.

CONCLUSION

Utilization prediction differed among phenotypes and race groups. Understanding the complexities behind phenotypes, data quality, and predictive models could help mitigate health disparities further downstream and inform clinical research and disease management.
