Center for Neurogenetics, Feil Family Brain and Mind Research Institute, Weill Cornell Medicine, New York, NY 10021.
Department of Physiology and Biophysics, Weill Cornell Medicine, New York, NY 10065.
Proc Natl Acad Sci U S A. 2021 Dec 21;118(51). doi: 10.1073/pnas.2106844118.
Spina bifida (SB) is a debilitating birth defect caused by multiple gene and environment interactions. Though SB shows non-Mendelian inheritance, genetic factors contribute to an estimated 70% of cases. Nevertheless, identifying human mutations conferring SB risk is challenging due to its relative rarity, genetic heterogeneity, incomplete penetrance, and environmental influences, all of which hamper genome-wide association study approaches to untargeted discovery. Thus, SB genetic studies may suffer from population substructure and/or selection bias introduced by typical candidate gene searches. We report a population-based, ancestry-matched whole-genome sequence analysis of SB genetic predisposition using a systems biology strategy to interrogate 298 case-control subject genomes (149 pairs). Genes enriched in rare, likely gene-disrupting (LGD) protein-coding variants were subjected to machine learning analysis to identify genes in which LGD variants occur at different frequencies in cases versus controls and thus discriminate between these groups. The genes with high discriminatory potential for SB showed significant enrichment in pathways pertaining to carbon metabolism, inflammation, innate immunity, cytoskeletal regulation, and essential transcriptional regulation, consistent with their having an impact on the pathogenesis of human SB. Additionally, an interrogation of conserved noncoding sequences identified robust variant enrichment in regulatory regions of several transcription factors critical to embryonic development. This genome-wide perspective offers an effective approach to the interrogation of coding and noncoding sequence variant contributions to rare complex genetic disorders.