The Lifecycle of Electronic Health Record Data in HIV-Related Big Data Studies: Qualitative Study of Bias Instances and Potential Opportunities for Minimization.

Authors

N'Diaye Arielle, Qiao Shan, Garrett Camryn, Khushf George, Zhang Jiajia, Li Xiaoming, Olatosi Bankole

Affiliations

Department of Health Promotion, Education, and Behavior, Arnold School of Public Health, University of South Carolina, Discovery 410, 915 Greene Street, Columbia, SC, 29208, United States, 1 803-777-6844.

Department of Philosophy, College of Arts and Sciences, University of South Carolina, Columbia, SC, United States.

Publication information

J Med Internet Res. 2025 Aug 7;27:e71388. doi: 10.2196/71388.

Abstract

BACKGROUND

Electronic health record (EHR) data are widely used in public health research, including in HIV-related studies, but are limited by potential bias due to incomplete and inaccurate information, lack of generalizability, and lack of representativeness.

OBJECTIVE

This study explores how workflow processes among HIV health care providers (HCPs), data scientists, and state health department professionals may introduce or minimize bias within EHR data.

METHODS

One focus group with 3 health department professionals working in HIV surveillance and 16 in-depth interviews (ie, 5 people with HIV, 5 HCPs, 5 data scientists, and 1 health department professional providing retention-in-care services) were conducted with participants purposively sampled in South Carolina from August 2023 to April 2024. All interviews were transcribed verbatim and analyzed using a constructivist grounded theory approach, in which transcripts were first coded and then subjected to focused, axial, and theoretical coding.

RESULTS

The EHR data lifecycle originates with people with HIV and HCPs in the clinical setting. Data scientists then curate EHR data, and health department professionals manage and use the data for surveillance and policy decision-making. Throughout this lifecycle, the three primary stakeholders (ie, HCPs, data scientists, and health department professionals) identified challenges with EHR processes and described their recommendations and accommodations for addressing them. HCPs reported that socio-structural biases influence how they inquire about, interpret, and document social determinants of health (SDOH) information for people living with HIV; giving people living with HIV access to their EHRs was proposed as a way to mitigate this influence. Data scientists identified limited data availability and representativeness as biasing the data they manage. Health department professionals face challenges with delayed and incomplete data, which may be addressed statistically but still require consideration of the data's limitations. Overall, bias within the EHR data lifecycle persists because workflows are not intentionally structured to minimize bias and because responsibility for data quality is diffused among the various stakeholders.

CONCLUSIONS

From the perspective of various stakeholders, this study describes the EHR data lifecycle and its associated challenges, as well as stakeholders' accommodations and recommendations for mitigating and eliminating bias in EHR data. Based on these findings, studies reliant on EHR data should adequately consider the challenges and limitations of those data. Throughout the EHR data lifecycle, bias could be reduced through an inclusive, supportive health care environment, verification of SDOH information by people living with HIV, the customization of data collection systems, and inspection of EHR data for completeness, accuracy, and timeliness. Future research is needed to further identify instances where bias is introduced and how it can best be mitigated and eliminated across the EHR data lifecycle. Systematic changes are necessary to reduce instances of bias between data workflows and stakeholders.
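As an illustration of what inspecting EHR data for completeness and timeliness might involve, the following is a minimal Python sketch. It is not the study's method: the field names (`patient_id`, `visit_date`, `report_date`, `housing_status`), the 30-day reporting threshold, and the three synthetic records are all assumptions made for the example. Accuracy checks are not covered, since they generally require comparison against an external source.

```python
from datetime import date

# Hypothetical field names; real EHR extracts will differ.
REQUIRED_FIELDS = ("patient_id", "visit_date", "report_date", "housing_status")


def completeness(records, fields=REQUIRED_FIELDS):
    """Return, per field, the share of records with a non-empty value."""
    total = len(records)
    if total == 0:
        return {f: 0.0 for f in fields}
    return {
        f: sum(1 for r in records if r.get(f) not in (None, "")) / total
        for f in fields
    }


def late_reports(records, max_delay_days=30):
    """Return records reported more than max_delay_days after the visit.

    The 30-day default is an assumed threshold, not one taken from the study.
    """
    late = []
    for r in records:
        visit, report = r.get("visit_date"), r.get("report_date")
        if visit is None or report is None:
            continue  # missing dates are surfaced by completeness(), not here
        if (report - visit).days > max_delay_days:
            late.append(r)
    return late


# Synthetic records, invented purely for illustration.
records = [
    {"patient_id": "A", "visit_date": date(2024, 1, 5),
     "report_date": date(2024, 1, 10), "housing_status": "stable"},
    {"patient_id": "B", "visit_date": date(2024, 1, 5),
     "report_date": date(2024, 3, 1), "housing_status": ""},
    {"patient_id": "C", "visit_date": date(2024, 2, 1),
     "report_date": None, "housing_status": None},
]

if __name__ == "__main__":
    for field, share in completeness(records).items():
        print(f"{field}: {share:.0%} complete")
    print("late:", [r["patient_id"] for r in late_reports(records)])
```

Per-field completeness rates of this kind can show whether SDOH fields are documented less consistently than clinical fields, and the delay check flags records that may distort near-real-time surveillance counts.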

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/b44c/12331130/9d9762eb74ee/jmir-v27-e71388-g001.jpg
