Identifying People Living With or Those at Risk for HIV in a Nationally Sampled Electronic Health Record Repository Called the National Clinical Cohort Collaborative: Computational Phenotyping Study.

Authors

Hurwitz Eric, Varley Cara D, Anzolone A Jerrod, Madhira Vithal, Olex Amy L, Sun Jing, Vaidya Dimple, Fadul Nada, Islam Jessica Y, Jackson Lesley E, Wilkins Kenneth J, Butzin-Dozier Zachary, Li Dongmei, Safo Sandra E, McMurry Julie A, Maheria Pooja, Williams Tommy, Hassan Shukri A, Haendel Melissa A, Patel Rena C

Affiliations

Department of Genetics, University of North Carolina at Chapel Hill, Chapel Hill, NC, United States.

School of Medicine, Oregon Health & Science University, Portland, OR, United States.

Publication information

JMIR Med Inform. 2025 Jul 11;13:e68143. doi: 10.2196/68143.

Abstract

BACKGROUND

Electronic health records (EHRs) provide valuable insights to address clinical and epidemiological research concerning HIV, including the disproportionate impact of the COVID-19 pandemic on people living with HIV. To identify this population, most studies using EHR or claims databases start with diagnostic codes, which can result in misclassification without further refinement using drug or laboratory data. Furthermore, given that antiretrovirals now have indications for both HIV and COVID-19 (ie, ritonavir in nirmatrelvir/ritonavir), new phenotyping methods are needed to better capture people living with HIV. Therefore, we created a generalizable and innovative method to robustly identify people living with HIV, preexposure prophylaxis (PrEP) users, postexposure prophylaxis (PEP) users, and people not living with HIV using granular clinical data after the emergence of COVID-19.

OBJECTIVE

The primary aim of this study was to use computational phenotyping in EHR data to identify people living with HIV (cohort 1), PrEP users (cohort 2), PEP users (cohort 3), or "none of the above" (people not living with HIV; cohort 4) and describe COVID-19-related characteristics among these cohorts.

METHODS

We used diagnostic and laboratory measurements and drug concepts in the National Clinical Cohort Collaborative to create a computational phenotype for the 4 cohorts with confidence levels. For robustness, we conducted a randomly sampled, blinded clinician annotation to assess precision. We calculated the distribution of demographics, comorbidities, and COVID-19 variables among the 4 cohorts.
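As a rough illustration of how three evidence domains (diagnoses, laboratory measurements, and drug concepts) might be combined per patient, the sketch below labels each patient by which domains carry HIV-related evidence and tallies the combinations. It is not the authors' algorithm: the `PatientEvidence` record, its boolean flags, and the label strings are hypothetical names introduced here, and the abstract does not specify the actual concept sets, rules, or confidence levels.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class PatientEvidence:
    """Hypothetical per-patient flags (an assumption, not the repository's data model)."""
    person_id: int
    has_condition: bool  # an HIV-related diagnostic code is present
    has_lab: bool        # an HIV-related laboratory measurement is present
    has_drug: bool       # an HIV-related drug exposure is present


def evidence_combination(p: PatientEvidence) -> str:
    """Return a label naming the evidence domains present for one patient."""
    parts = []
    if p.has_condition:
        parts.append("condition")
    if p.has_lab:
        parts.append("lab")
    if p.has_drug:
        parts.append("drug")
    return "+".join(parts) if parts else "none"


def tally_combinations(patients) -> Counter:
    """Count how many patients fall into each evidence combination."""
    return Counter(evidence_combination(p) for p in patients)


if __name__ == "__main__":
    # Toy records for illustration only; these are not study data.
    toy = [
        PatientEvidence(1, True, True, True),
        PatientEvidence(2, False, True, True),
        PatientEvidence(3, True, False, True),
        PatientEvidence(4, False, False, False),
    ]
    for label, n in tally_combinations(toy).items():
        print(label, n)
```

A tally of this kind is one way to report which evidence combinations account for a cohort; a real phenotype would also have to handle cases such as antiretrovirals prescribed for reasons other than HIV treatment.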

RESULTS

We identified 132,664 people living with HIV with a high level of confidence, 36,088 PrEP users, 4120 PEP users, and 20,639,675 people not living with HIV. Most people living with HIV were identified by a combination of medical conditions, laboratory measurements, and drug exposures (74,809/132,664, 56.4%), followed by laboratory measurements and drug exposures (15,241/132,664, 11.5%) and then by medical conditions and drug exposures (14,595/132,664, 11%). A higher proportion of people living with HIV experienced COVID-19-related hospitalization (4650/132,664, 3.5%) or mortality (828/132,664, 0.6%) and all-cause mortality (2083/132,664, 1.6%) compared to other cohorts.
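The reported percentages can be recomputed from the counts given above, each taken over the 132,664 people living with HIV. The short check below is only arithmetic on the published numbers; the dictionary keys are descriptive labels chosen here, not terms from the study.

```python
# Denominator: people living with HIV identified with a high level of confidence.
HIV_COHORT = 132_664

# Numerators as reported in the Results paragraph.
reported_counts = {
    "conditions + labs + drugs": 74_809,
    "labs + drugs": 15_241,
    "conditions + drugs": 14_595,
    "COVID-19-related hospitalization": 4_650,
    "COVID-19-related mortality": 828,
    "all-cause mortality": 2_083,
}


def percent(numerator: int, denominator: int) -> float:
    """Percentage rounded to one decimal place."""
    return round(100 * numerator / denominator, 1)


percentages = {
    label: percent(count, HIV_COHORT) for label, count in reported_counts.items()
}

if __name__ == "__main__":
    for label, value in percentages.items():
        print(f"{label}: {value}%")
```

The recomputed values agree with the stated 56.4%, 11.5%, 11%, 3.5%, 0.6%, and 1.6%.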

CONCLUSIONS

Using an extensive phenotyping algorithm leveraging granular data in an EHR repository, we have identified people living with HIV, people not living with HIV, PrEP users, and PEP users. Our findings offer transferable lessons to optimize future EHR phenotyping for these cohorts.

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/39ea/12299939/38582ab51c79/medinform_v13i1e68143_fig1.jpg
