Harvard Medical School, Cambridge, MA, USA.
University of Pennsylvania Perelman School of Medicine, Philadelphia, PA, USA.
J Biomed Inform. 2023 Mar;139:104306. doi: 10.1016/j.jbi.2023.104306. Epub 2023 Feb 3.
In electronic health records, patterns of missing laboratory test results may capture a patient's course of disease and reflect clinicians' concerns about possible conditions. These patterns are often overlooked and understudied. This study aims to identify informative patterns of missingness in laboratory data collected for COVID-19 inpatients across 15 healthcare system sites in three countries.
We collected and analyzed demographic, diagnosis, and laboratory data for 69,939 patients with positive COVID-19 PCR tests across three countries from 1 January 2020 through 30 September 2021. We analyzed missing laboratory measurements across sites, missingness stratified by demographic variables, temporal trends of missingness, correlations between labs based on missingness indicators over time, and clustering of labs by their missingness/ordering patterns.
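As a rough illustration of what a missingness indicator and a correlation between labs can look like, the following minimal sketch uses only the Python standard library. It is not the study's pipeline: the toy records, the lab names (`crp`, `ferritin`, `troponin`), and the helper functions are hypothetical, and Pearson correlation on 0/1 indicators (the phi coefficient) is assumed here as one simple way to relate two labs.

```python
from math import sqrt

# Hypothetical toy data: one dict per patient-day.
# None marks a lab that was not measured (i.e., missing).
records = [
    {"crp": 12.0, "ferritin": 300.0, "troponin": None},
    {"crp": None, "ferritin": None, "troponin": None},
    {"crp": 8.5, "ferritin": 410.0, "troponin": 0.02},
    {"crp": None, "ferritin": None, "troponin": 0.01},
    {"crp": 20.1, "ferritin": 520.0, "troponin": None},
    {"crp": None, "ferritin": 150.0, "troponin": None},
]


def missing_indicators(rows, lab):
    """Return 1 where the lab is missing and 0 where it was measured."""
    return [1 if row.get(lab) is None else 0 for row in rows]


def missing_rate(rows, lab):
    """Fraction of rows in which the lab is missing."""
    indicators = missing_indicators(rows, lab)
    return sum(indicators) / len(indicators)


def pearson(x, y):
    """Pearson correlation; on 0/1 indicators this equals the phi coefficient."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return float("nan")  # undefined when a lab is always or never missing
    return cov / sqrt(var_x * var_y)


def missingness_correlation(rows, lab_a, lab_b):
    """Correlate the missingness indicators of two labs."""
    return pearson(missing_indicators(rows, lab_a), missing_indicators(rows, lab_b))


if __name__ == "__main__":
    for lab in ("crp", "ferritin", "troponin"):
        print(lab, "missing rate:", round(missing_rate(records, lab), 3))
    print("crp vs ferritin:", round(missingness_correlation(records, "crp", "ferritin"), 3))
```

A positive value suggests the two labs tend to be ordered, or left unordered, together. A matrix of such pairwise values could then serve as input to a clustering step.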
These analyses identified mapping issues at seven of the 15 sites, as well as site-specific nuances in data collection and variable definition. Temporal trend analyses may support using patterns of laboratory test result missingness to identify patients with severe COVID-19. Lastly, missingness patterns revealed relationships between labs that reflect clinical behaviors.
In this work, we use computational approaches to relate missingness patterns to hospital treatment capacity and highlight the heterogeneity that arises when studying COVID-19 over time and across multiple sites, which may differ in phases, policies, and other factors. Changes in missingness could suggest a change in a patient's condition, and patterns of missingness among laboratory measurements could potentially identify clinical outcomes. This allows sites to treat missing data as informative in analyses and helps researchers identify which sites are better poised to study particular questions.