Institute for Systems Biology, Seattle, WA 98109.
Inova Translational Medicine Institute, Inova Health System and Inova Fairfax Medical Center, Falls Church, VA 22042.
Proc Natl Acad Sci U S A. 2019 Mar 19;116(12):5819-5827. doi: 10.1073/pnas.1716314116. Epub 2019 Mar 4.
Preterm birth (PTB) complications are the leading cause of long-term morbidity and mortality in children. By using whole blood samples, we integrated whole-genome sequencing (WGS), RNA sequencing (RNA-seq), and DNA methylation data for 270 PTB and 521 control families. We analyzed this combined dataset to identify genomic variants associated with PTB and performed secondary analyses to identify variants associated with very early PTB (VEPTB) as well as other subcategories of disease that may contribute to PTB. We identified differentially expressed genes (DEGs) and differentially methylated genomic loci and performed expression and methylation quantitative trait loci analyses to link genomic variants to these expression and methylation changes. We performed enrichment tests to identify overlaps between new and known PTB candidate gene systems. We identified 160 significant genomic variants associated with PTB-related phenotypes. The most significant variants, DEGs, and differentially methylated loci were associated with VEPTB. Integration of all data types identified a set of 72 candidate biomarker genes for VEPTB, encompassing both newly identified genes and those previously associated with PTB. Notably, PTB-associated genes RAB31 and RBPJ were identified by all three data types (WGS, RNA-seq, and methylation). Pathways associated with VEPTB include EGFR and prolactin signaling pathways, inflammation- and immunity-related pathways, chemokine signaling, IFN-γ signaling, and Notch1 signaling. Progress in identifying molecular components of a complex disease is aided by integrated analyses of multiple molecular data types and clinical data. With these data, and by stratifying PTB by subphenotype, we have identified associations between VEPTB and the underlying biology.