Beijing Obstetrics and Gynecology Hospital, Capital Medical University, Beijing Maternal and Child Health Care Hospital, Beijing 100026, P. R. China.
BGI-Beijing Clinical Laboratories, BGI-Shenzhen, Beijing 101300, P. R. China.
Brief Bioinform. 2023 Nov 22;25(1). doi: 10.1093/bib/bbad492.
Gestational diabetes mellitus (GDM) is a common complication of pregnancy that has significant adverse effects on both the mother and the fetus. The incidence of GDM is increasing globally, and early diagnosis is critical for timely treatment and for reducing the risk of poor pregnancy outcomes. GDM is usually diagnosed after 24 weeks of gestation, although complications due to GDM can occur much earlier. Copy number variations (CNVs) may serve as a biomarker for GDM diagnosis and screening in early gestation. In this study, we propose a machine-learning method to screen for GDM in the early stage of gestation using cell-free DNA (cfDNA) sequencing data from maternal plasma. A total of 5085 patients from northern regions of Mainland China, including 1942 with GDM, were recruited. A non-overlapping sliding window method was applied for CNV coverage screening on low-coverage (~0.2×) sequencing data. The CNV coverage was fed to a convolutional neural network with an attention architecture for binary classification. The model achieved a classification accuracy of 88.14%, precision of 84.07%, recall of 93.04%, F1-score of 88.33% and AUC of 96.49%. The model identified 2190 genes associated with GDM, including DEFA1, DEFA3 and DEFB1. The enriched gene ontology (GO) terms and KEGG pathways showed that many of the identified genes are associated with diabetes-related pathways. Our study demonstrates the feasibility of using cfDNA sequencing data and machine-learning methods for early diagnosis of GDM, which may aid early intervention and the prevention of adverse pregnancy outcomes.
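The non-overlapping window step mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' pipeline: the function name `window_coverage`, the window size, and the mean-scaling normalization are all assumptions chosen for illustration, since the abstract does not state them.

```python
import numpy as np

def window_coverage(read_starts, chrom_length, window_size):
    """Count reads in non-overlapping windows and scale by the mean count.

    Hypothetical helper: the window size and the normalization are
    illustrative assumptions, not values taken from the study.
    """
    n_windows = -(-chrom_length // window_size)  # ceiling division
    # Each read is assigned to exactly one window by its start position.
    idx = np.asarray(read_starts, dtype=np.int64) // window_size
    counts = np.bincount(idx, minlength=n_windows).astype(float)
    mean = counts.mean()
    # Scale so that a window with average depth has value 1.0.
    return counts / mean if mean > 0 else counts

# Toy example: five reads on a 30 bp "chromosome" with 10 bp windows.
cov = window_coverage([0, 5, 10, 15, 25], chrom_length=30, window_size=10)
print(cov)
```

In a sketch like this, windows with values well above or below 1.0 would be the candidate gain or loss regions; a per-window vector of this kind is the sort of input a convolutional classifier could consume.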