Department of Health Services Policy and Management, Arnold School of Public Health, University of South Carolina, Columbia, South Carolina, United States of America.
Department of Epidemiology & Biostatistics, Arnold School of Public Health, University of South Carolina, Columbia, South Carolina, United States of America.
PLoS One. 2022 Oct 31;17(10):e0276923. doi: 10.1371/journal.pone.0276923. eCollection 2022.
Identifying the timing of SARS-CoV-2 infection relative to specific gestational weeks is critical for delineating the role of infection timing in adverse pregnancy outcomes. However, this is difficult to do with Electronic Health Records (EHR). To support maternal health research during the COVID-19 pandemic, we sought to develop and validate a clinical information extraction algorithm that detects the timing of clinical events relative to gestational weeks.
We used EHR from the National COVID Cohort Collaborative (N3C), in which the EHR are normalized to the Observational Medical Outcomes Partnership (OMOP) Common Data Model (CDM). EHR phenotyping identified 270,897 pregnant women (June 1, 2018 to May 31, 2021). We developed a rule-based algorithm and performed a multi-level evaluation of its content validity, its clinical validity, and its handling of extreme lengths of gestation (<150 or >300 days).
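To make the idea of placing a clinical event at a gestational week concrete, here is a minimal sketch. It is not the authors' rule-based algorithm, whose rules are not given in this abstract. It assumes a delivery date and a gestational age at delivery are already known, back-calculates a pregnancy start date, and counts completed weeks up to the event. All function names and example dates are hypothetical, and the 150 and 300 thresholds are assumed to be in days.

```python
from datetime import date, timedelta

# Thresholds quoted in the text for "extreme" gestation length.
# Assumption: the unit is days.
MIN_TYPICAL_DAYS = 150
MAX_TYPICAL_DAYS = 300


def pregnancy_start(delivery_date: date, ga_weeks: int, ga_days: int = 0) -> date:
    """Back-calculate the pregnancy start from delivery date and gestational age."""
    return delivery_date - timedelta(weeks=ga_weeks, days=ga_days)


def gestational_week(event_date: date, start_date: date) -> int:
    """Return completed gestational weeks at the time of a clinical event."""
    elapsed_days = (event_date - start_date).days
    if elapsed_days < 0:
        raise ValueError("event precedes the estimated pregnancy start")
    return elapsed_days // 7


def is_extreme_gestation(start_date: date, delivery_date: date) -> bool:
    """Flag pregnancies whose length falls outside the 150-300 day range."""
    length_days = (delivery_date - start_date).days
    return length_days < MIN_TYPICAL_DAYS or length_days > MAX_TYPICAL_DAYS


# Hypothetical example: delivery at 39 weeks 2 days, infection recorded earlier.
start = pregnancy_start(date(2021, 3, 15), ga_weeks=39, ga_days=2)
infection_week = gestational_week(date(2020, 12, 1), start)
```

Counting completed weeks with integer division is one convention among several; a real pipeline would also need rules for records where the gestational age itself has to be inferred.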
The algorithm identified 296,194 pregnancies (16,659 COVID-19, 174,744 without COVID-19) in 270,897 pregnant women. For inferring gestational age, 95% of cases (n = 40) had moderate-to-high accuracy (Cohen's Kappa = 0.62), and 100% of cases (n = 40) had moderate-to-high granularity of temporal information (Cohen's Kappa = 1). For inferring delivery dates, the accuracy was 100% (Cohen's Kappa = 1). The accuracy of gestational age detection for extreme lengths of gestation was 93.3% (Cohen's Kappa = 1). Mothers with COVID-19 showed a higher prevalence of obesity or overweight (35.1% vs. 29.5%), diabetes (17.8% vs. 17.0%), chronic obstructive pulmonary disease (0.2% vs. 0.1%), and respiratory distress syndrome or acute respiratory failure (1.8% vs. 0.2%).
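For readers unfamiliar with the agreement statistic reported above, Cohen's Kappa compares the observed agreement between two raters, p_o, with the agreement expected by chance, p_e, as kappa = (p_o - p_e) / (1 - p_e). The sketch below computes it from two lists of labels. The ratings are made up for illustration and are not the study's data, and the function name is hypothetical.

```python
from collections import Counter


def cohens_kappa(rater_a, rater_b):
    """Cohen's Kappa for two raters labeling the same items."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("both raters must label the same non-empty set of items")
    n = len(rater_a)

    # Observed agreement: share of items given the same label by both raters.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n

    # Chance agreement: product of each rater's marginal label frequencies.
    counts_a = Counter(rater_a)
    counts_b = Counter(rater_b)
    expected = sum(counts_a[label] * counts_b[label] for label in counts_a) / (n * n)

    if expected == 1.0:
        # Kappa is undefined (0/0) when both raters use one identical label.
        # Returning 1.0 here is a choice made for this sketch only.
        return 1.0
    return (observed - expected) / (1 - expected)


# Hypothetical accuracy ratings from two reviewers on four cases.
reviewer_1 = ["high", "high", "low", "low"]
reviewer_2 = ["high", "low", "low", "low"]
kappa = cohens_kappa(reviewer_1, reviewer_2)
```

In this toy example the reviewers agree on 3 of 4 cases (p_o = 0.75) and chance agreement is p_e = 0.5, so kappa is 0.5.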
Using our algorithm, we characterized pregnant women by the gestational week of SARS-CoV-2 infection. TED-PC is the first algorithm to infer the exact gestational week linked with every clinical event from EHR and to detect the timing of SARS-CoV-2 infection in pregnant women.
The algorithm shows excellent clinical validity in inferring gestational age and delivery dates, which supports multiple EHR cohorts on N3C studying the impact of COVID-19 on pregnancy.