Guo Zhiwei, Wang Ke, Huang Xiang, Li Kun, Ouyang Guojun, Yang Xu, Tan Jiayu, Shi Haihong, Luo Liangping, Zhang Min, Han Bowei, Zhai Xiangming, Deng Jinhai, Beatson Richard, Wu Yingsong, Yang Fang, Yang Xuexi, Tang Jia
Department of Obstetrics and Gynaecology, Guangzhou First People's Hospital, School of Medicine, South China University of Technology, Guangzhou, China.
NHC Key Laboratory of Male Reproduction and Genetics, Guangdong Provincial Reproductive Science Institute (Guangdong Provincial Fertility Hospital), Guangzhou, China.
PLoS Med. 2025 Apr 15;22(4):e1004571. doi: 10.1371/journal.pmed.1004571. eCollection 2025 Apr.
Preterm birth (PTB) occurs in approximately 11% of all births worldwide, resulting in significant morbidity and mortality for both mothers and their offspring. Identifying pregnancies at risk of preterm birth during early pregnancy may help improve interventions and reduce its incidence. Plasma cell-free DNA (cfDNA), derived from placenta and other maternal tissues, serves as a dynamic indicator of biological processes and pathological changes in pregnancy. These properties establish cfDNA as a valuable biomarker for investigating pregnancy complications, including PTB.
To date, few PTB prediction methods have been developed with large sample sizes and high-throughput screening and then validated in independent cohorts. To address this gap, we established a large-scale, multi-center case-control study involving 2,590 pregnancies (2,072 full-term and 518 preterm) from three independent hospitals to develop a spontaneous preterm birth classifier. We performed whole-genome sequencing on cfDNA, focusing on promoter profiling (read depth of promoter regions spanning from -1 to +1 kb around transcriptional start sites). Using four machine learning models and two feature selection algorithms, we developed classifiers for predicting preterm birth. Among these, the classifier based on the support vector machine model, named PTerm (Promoter profiling classifier for preterm prediction), achieved the highest area under the curve (AUC), 0.878 (0.852-0.904), under leave-one-out cross-validation. PTerm also performed well in three independent validation cohorts, achieving an overall AUC of 0.849 (0.831-0.866).
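As a rough illustration of two quantities named above, promoter read depth and AUC, the following minimal Python sketch may help. It is not the authors' pipeline: the function names (`promoter_window`, `promoter_depth`, `auc`), the use of read start positions as a proxy for depth, and the toy inputs are all assumptions made for illustration. The classifier itself (feature selection and the support vector machine) is deliberately left out.

```python
from bisect import bisect_left

def promoter_window(tss, flank=1000):
    """Half-open interval [tss - flank, tss + flank), clipped at 0.

    flank=1000 mirrors the -1 to +1 kb window around the TSS.
    """
    return max(0, tss - flank), tss + flank

def promoter_depth(read_starts, tss, flank=1000):
    """Count reads whose start falls inside the promoter window.

    Assumption: read_starts is a sorted list of read start
    coordinates on a single chromosome (hypothetical input format).
    """
    start, end = promoter_window(tss, flank)
    return bisect_left(read_starts, end) - bisect_left(read_starts, start)

def auc(scores, labels):
    """Rank-based AUC: the probability that a randomly chosen case
    (label 1) scores higher than a randomly chosen control (label 0),
    with ties counted as 0.5.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Toy data, invented for illustration only.
reads = [100, 900, 1500, 2000, 2999, 3000, 5000]
depth = promoter_depth(reads, tss=2000)
score = auc([0.9, 0.2, 0.8, 0.1], [1, 1, 0, 0])
```

In practice the per-promoter counts would typically be normalized for sequencing depth before being used as classifier features, and under leave-one-out cross-validation each sample's score would come from a model trained on all other samples; both steps are omitted here.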
In summary, PTerm demonstrated high accuracy in predicting preterm birth. It can also be applied to existing non-invasive prenatal test data without changing test procedures or increasing detection cost, making it easily adaptable for preclinical tests.