Klein Ari Z, Kunatharaju Shriya, Golder Su, Levine Lisa D, Figueiredo Jane C, Gonzalez-Hernandez Graciela
Department of Biostatistics, Epidemiology, and Informatics, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, United States.
Department of Genetics, University of Pennsylvania, Philadelphia, PA, United States.
J Med Internet Res. 2025 Jul 9;27:e66097. doi: 10.2196/66097.
Preterm birth, defined as birth at <37 weeks of gestation, is the leading cause of neonatal death globally and the second leading cause of infant mortality in the United States. There is mounting evidence that COVID-19 infection during pregnancy is associated with an increased risk of preterm birth; however, data stratified by trimester of infection remain limited. In particular, the study of COVID-19 infection during the earlier stages of pregnancy has been constrained by the available sources of data.
The objective of this study was to use self-reports in large-scale social media data to assess the association between the trimester of COVID-19 infection and preterm birth.
In this retrospective cohort study, we used natural language processing and machine learning, followed by manual validation, to identify self-reports of pregnancy on Twitter and to search these users' publicly available tweets for self-reports of COVID-19 infection during pregnancy and, subsequently, a preterm birth or term birth outcome. Among the users who reported their pregnancy on Twitter, we also identified a 1:1 age-matched control group, consisting of users with a due date before January 1, 2020, that is, without COVID-19 infection during pregnancy. We calculated odds ratios (ORs) with 95% CIs to compare the frequency of preterm birth for pregnancies with and without COVID-19 infection and by the timing of infection: first trimester (1-13 weeks), second trimester (14-27 weeks), or third trimester (28-36 weeks).
Through August 2022, we identified 298 Twitter users who reported COVID-19 infection during pregnancy, a preterm birth or term birth outcome, and maternal age: 94 (31.5%) with first-trimester infection, 110 (36.9%) with second-trimester infection, and 95 (31.9%) with third-trimester infection. In total, 26 (8.8%) of these 298 users reported preterm birth: 8 (8.5%) with first-trimester infection, 7 (6.4%) with second-trimester infection, and 12 (12.6%) with third-trimester infection. In the 1:1 age-matched control group, 13 (4.4%) of the 298 users reported preterm birth. Overall, the odds of preterm birth were significantly higher for pregnancies with COVID-19 infection compared to those without (OR 2.08, 95% CI 1.06-4.28; P=.046). In particular, the odds of preterm birth were significantly higher for pregnancies with COVID-19 infection during the third trimester (OR 3.16, 95% CI 1.36-7.29; P=.007). The odds of preterm birth were not significantly higher for pregnancies with COVID-19 infection during the first trimester (OR 2.05, 95% CI 0.78-5.08; P=.12) or second trimester (OR 1.50, 95% CI 0.54-3.82; P=.44) compared to those without infection.
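As a rough illustration of how odds ratios of this kind can be derived from the reported counts, the following sketch computes the sample (cross-product) OR with a Wald 95% CI for each group against the 298 controls. It is not the authors' analysis code: the function name `odds_ratio_wald` and the `groups` dictionary are hypothetical, and the abstract does not state which estimator or interval method was used, so the output should be expected to come close to, but not exactly match, the reported ORs and CIs.

```python
import math


def odds_ratio_wald(exposed_cases, exposed_total, control_cases, control_total, z=1.96):
    """Sample odds ratio with a Wald confidence interval from a 2x2 table.

    Hypothetical helper for illustration; assumes all four cells are nonzero.
    """
    a = exposed_cases                  # exposed, preterm
    b = exposed_total - exposed_cases  # exposed, term
    c = control_cases                  # unexposed, preterm
    d = control_total - control_cases  # unexposed, term

    odds_ratio = (a * d) / (b * c)
    # Standard error of log(OR) under the usual large-sample approximation
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper


# Counts as reported in the abstract: (preterm births, users in group)
groups = {
    "any trimester": (26, 298),
    "first trimester": (8, 94),
    "second trimester": (7, 110),
    "third trimester": (12, 95),
}
CONTROL_CASES, CONTROL_TOTAL = 13, 298  # age-matched controls

for name, (cases, total) in groups.items():
    or_, lo, hi = odds_ratio_wald(cases, total, CONTROL_CASES, CONTROL_TOTAL)
    print(f"{name}: OR {or_:.2f} (95% CI {lo:.2f}-{hi:.2f})")
```

The Wald interval is only one common choice; exact or conditional methods are often preferred when cell counts are small, as they are for the trimester-specific groups here.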
Based on self-reports in large-scale social media data, the results of our study suggest that COVID-19 infection during pregnancy, particularly during the third trimester, is associated with higher odds of preterm birth.