Department of Psychiatry and Behavioral Sciences, University of California San Francisco, 513 Parnassus Ave., Health Sciences East 9th floor HSE901E, San Francisco, CA, 94143, USA.
Institute for Human Genetics, University of California San Francisco, San Francisco, CA, 94143, USA.
Genome Med. 2021 Oct 29;13(1):172. doi: 10.1186/s13073-021-00972-1.
Deletions and duplications of the multigenic 16p11.2 and 22q11.2 copy number variant (CNV) regions are associated with brain-related disorders including schizophrenia, intellectual disability, obesity, bipolar disorder, and autism spectrum disorder (ASD). How individual CNV genes contribute to each of these phenotypes is unknown, as is their contribution to other, potentially subtler health effects in carriers. Hypothesizing that DNA copy number exerts most of its effects through RNA expression, we attempted a novel in silico fine-mapping approach in non-CNV carriers using both GWAS and biobank data.
We first asked whether the expression level of any individual gene in the CNV regions alters risk for known CNV-associated behavioral phenotypes. Using transcriptomic imputation, we performed association testing for CNV genes within large genotyped cohorts for schizophrenia, IQ, BMI, bipolar disorder, and ASD. Second, we used a biobank containing electronic health data to compare the medical phenome of CNV carriers with that of controls among 700,000 individuals, in order to investigate the full spectrum of health effects of the CNVs. Third, we used genotypes for over 48,000 individuals within the biobank to perform phenome-wide association studies between the imputed expression of individual 16p11.2 and 22q11.2 genes and over 1500 health traits.
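As a rough illustration of the transcriptomic imputation step, the sketch below predicts a gene's expression as a weighted sum of SNP dosages and then tests the predicted expression for association with a quantitative trait. It is a minimal sketch on synthetic data, not the pipeline used in the study: the weights, sample size, effect size, and the function names impute_expression and association_z are all assumptions made for this example.

```python
import numpy as np

def impute_expression(dosages, weights):
    """Predict expression as a weighted sum of SNP dosages.

    dosages: (n_samples, n_snps) array of allele counts (0, 1, 2)
    weights: (n_snps,) array of per-SNP expression weights (hypothetical here)
    """
    return dosages @ weights

def association_z(expr, trait):
    """Simple linear regression of trait on imputed expression.

    Returns the slope and its z-score (slope / standard error).
    """
    x = expr - expr.mean()
    y = trait - trait.mean()
    beta = (x @ y) / (x @ x)
    resid = y - beta * x
    dof = len(y) - 2
    se = np.sqrt((resid @ resid) / dof / (x @ x))
    return beta, beta / se

# Synthetic data only; none of these numbers come from the study.
rng = np.random.default_rng(0)
n_samples, n_snps = 2000, 20
dosages = rng.binomial(2, 0.3, size=(n_samples, n_snps)).astype(float)
weights = rng.normal(0.0, 0.2, size=n_snps)      # assumed eQTL weights
expr = impute_expression(dosages, weights)
trait = 0.5 * expr + rng.normal(0.0, 1.0, size=n_samples)  # assumed true effect 0.5

beta, z = association_z(expr, trait)
print(f"estimated effect = {beta:.3f}, z = {z:.1f}")
```

In practice the weights would come from an expression reference panel and the association model would include covariates; both are omitted here to keep the example short.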
Using large genotyped cohorts, we found individual genes within 16p11.2 associated with schizophrenia (TMEM219, INO80E, YPEL3), BMI (TMEM219, SPN, TAOK2, INO80E), and IQ (SPN). Conditional analysis identified upregulation of INO80E as the driver of schizophrenia, and downregulation of SPN and INO80E as increasing BMI. We identified both novel and previously observed over-represented traits within the electronic health records of 16p11.2 and 22q11.2 CNV carriers. In the phenome-wide association study, we found seventeen significant gene-trait pairs, including psychosis (NPIPB11, SLX1B) and mood disorders (SCARF2), and overall enrichment of mental traits.
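The idea behind conditional analysis can be sketched as a joint regression: when the imputed expression of two genes is correlated, both may associate with a trait marginally, but only the driver keeps its signal once both are fitted together. The example below is a sketch on synthetic data under that assumption. The variables gene_a and gene_b are hypothetical placeholders, not genes from the study, and the study's actual conditional procedure may differ.

```python
import numpy as np

def joint_z(expr_matrix, trait):
    """Fit trait ~ intercept + all columns of expr_matrix by ordinary least squares.

    Returns one z-score per expression column (intercept excluded).
    """
    n, k = expr_matrix.shape
    X = np.column_stack([np.ones(n), expr_matrix])
    beta, *_ = np.linalg.lstsq(X, trait, rcond=None)
    resid = trait - X @ beta
    sigma2 = (resid @ resid) / (n - k - 1)
    cov = sigma2 * np.linalg.inv(X.T @ X)
    return (beta / np.sqrt(np.diag(cov)))[1:]

# Synthetic data only: gene_a drives the trait, gene_b is merely correlated with gene_a.
rng = np.random.default_rng(1)
n = 5000
gene_a = rng.normal(size=n)
gene_b = 0.7 * gene_a + np.sqrt(1 - 0.7**2) * rng.normal(size=n)
trait = 0.3 * gene_a + rng.normal(size=n)

# Marginal test: gene_b alone looks associated because it tags gene_a.
marginal_b = joint_z(gene_b[:, None], trait)[0]

# Conditional test: with both genes in the model, only gene_a should stay significant.
cond_a, cond_b = joint_z(np.column_stack([gene_a, gene_b]), trait)
print(f"marginal z(gene_b) = {marginal_b:.1f}")
print(f"conditional z(gene_a) = {cond_a:.1f}, z(gene_b) = {cond_b:.1f}")
```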
Our results demonstrate how integration of genetic and clinical data aids in understanding CNV gene function, and they implicate pleiotropy and multigenicity in CNV biology.