Jiang Kai, Cao Tru
The University of Texas Health Science Center at Houston School of Public Health, Houston, Texas, United States.
AMIA Jt Summits Transl Sci Proc. 2024 May 31;2024:555-564. eCollection 2024.
Automatic HIV phenotyping is needed for HIV research based on electronic health records (EHRs). MIMIC-IV, an extension of MIMIC-III, contains more than 520,000 hospital admissions and has become a valuable EHR database for secondary medical research. However, no prior phenotyping algorithm existed for extracting HIV cases from MIMIC-IV, a task that requires comprehensive knowledge of the database. Moreover, previous HIV phenotyping algorithms did not consider the new HIV-1/HIV-2 antibody differentiation immunoassay tests that MIMIC-IV contains. Our work provides insight into the structure and data elements of MIMIC-IV and proposes a new HIV phenotyping algorithm to fill these gaps. The results include the MIMIC-IV data tables and elements used, 1,781 and 1,843 HIV cases from MIMIC-IV versions 0.4 and 2.1, respectively, and summary statistics of these two HIV case cohorts. These results could be used to develop statistical and machine learning models in future studies of the disease.