Program in Medical and Population Genetics, Broad Institute, Cambridge, Massachusetts, United States of America.
PLoS Genet. 2010 Sep 9;6(9):e1001097. doi: 10.1371/journal.pgen.1001097.
Investigators have linked rare copy number variants (CNVs) to neuropsychiatric diseases, such as schizophrenia. One hypothesis is that CNV events cause disease by affecting genes with specific brain functions. If so, CNV events in cases should impact brain-function genes more frequently than CNV events in controls. Previous publications have applied "pathway" analyses to genes within neuropsychiatric case CNVs to show enrichment for brain functions. While such analyses have been suggestive, they often have not rigorously compared the rate of CNVs impacting brain-function genes in cases with the rate in controls, and therefore do not address important confounders such as the large size of brain genes and overall case-control differences in the rates and sizes of CNVs. To demonstrate the potential impact of these confounders, we genotyped rare CNV events in 2,415 unaffected controls with Affymetrix 6.0; we then applied standard pathway analyses using four sets of brain-function genes and observed an apparently highly significant enrichment for each set, even though all of the samples were controls. This enrichment is driven simply by the large size of brain-function genes. As an alternative, we propose a case-control statistical test, cnv-enrichment-test, to compare the rate of CNVs impacting specific gene sets in cases versus controls. With simulations, we demonstrate that cnv-enrichment-test is robust to case-control differences in CNV size and CNV rate, and to systematic differences in gene size. Finally, we apply cnv-enrichment-test to rare CNV events published by the International Schizophrenia Consortium (ISC). This approach reveals nominal evidence of case association for the neuronal-activity and learning gene sets, but not for the other two gene sets examined. The neuronal-activity genes have also been associated in a separate set of schizophrenia cases and controls; however, testing in independent samples is necessary to definitively confirm this association. Our method is implemented in the PLINK software package.
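The gene-size confounder described above can be illustrated with a toy simulation. The sketch below is not the cnv-enrichment-test as implemented in PLINK, and the abstract does not specify that model. The genome length, gene sizes, CNV length, CNV counts, and function names are all hypothetical values chosen for illustration. CNVs are placed uniformly at random with no disease effect, yet large "brain" genes absorb far more hits than their share of the gene count, while the share is about the same in simulated cases and controls.

```python
import random

# All constants below are hypothetical values chosen for illustration only.
GENOME_LEN = 200_000_000   # toy genome length in base pairs
SLOT = 1_000_000           # one gene is placed at the start of each 1 Mb slot
CNV_LEN = 100_000          # every simulated CNV has the same length


def make_genes(n_genes=200, brain_every=5, brain_len=500_000, other_len=50_000):
    """Return (start, end, is_brain) tuples; every 5th gene is a large 'brain' gene."""
    genes = []
    for i in range(n_genes):
        is_brain = (i % brain_every == 0)
        length = brain_len if is_brain else other_len
        genes.append((i * SLOT, i * SLOT + length, is_brain))
    return genes


def brain_share_of_hits(genes, n_cnvs, rng):
    """Place CNVs uniformly at random and return the fraction of
    gene-overlapping CNVs that overlap a brain gene."""
    brain_hits = 0
    other_hits = 0
    for _ in range(n_cnvs):
        start = rng.uniform(0, GENOME_LEN - CNV_LEN)
        end = start + CNV_LEN
        for g_start, g_end, is_brain in genes:
            if g_start < end and start < g_end:  # intervals overlap
                if is_brain:
                    brain_hits += 1
                else:
                    other_hits += 1
    return brain_hits / (brain_hits + other_hits)


rng = random.Random(0)  # fixed seed so the run is reproducible
genes = make_genes()

# Naive baseline: the fraction of genes that belong to the brain set.
baseline = sum(1 for g in genes if g[2]) / len(genes)

# Cases carry more CNVs than controls, but placement is random in both groups.
control_share = brain_share_of_hits(genes, 4000, rng)
case_share = brain_share_of_hits(genes, 6000, rng)

print(f"brain genes as a share of all genes: {baseline:.2f}")
print(f"brain share of CNV hits in controls: {control_share:.2f}")
print(f"brain share of CNV hits in cases:    {case_share:.2f}")
```

Comparing either group's hit share against the gene-count baseline suggests a strong "enrichment" that gene size alone produces. Comparing cases directly against controls removes it, which is the motivation for a case-control test.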