Psychiatric Iowa Neuroimaging Consortium, University of Iowa Carver College of Medicine, Iowa City, IA 52242, USA.
Mol Psychiatry. 2012 Nov;17(11):1093-102. doi: 10.1038/mp.2011.108. Epub 2011 Aug 30.
Although schizophrenia is generally thought to arise from multiple genes that interact with one another, very few methods have been developed to model epistasis. Phenotype definition has also been a major challenge for research on the genetics of schizophrenia. In this report, we use novel statistical techniques to address the high dimensionality of genomic data, and we refine the phenotype definition by basing it on brain changes that occur during the early course of the illness, as measured by repeated magnetic resonance scans (i.e., an 'intermediate phenotype'). The method combines a machine-learning algorithm, an ensemble method using stochastic gradient boosting, with traditional general linear model statistics. We began with 14 genes that are relevant to schizophrenia, based on association studies or their role in neurodevelopment, and then used statistical techniques to reduce them to five genes and 17 single nucleotide polymorphisms (SNPs) that had a significant statistical interaction: five for PDE4B, four for RELN, four for ERBB4, three for DISC1 and one for NRG1. Five of the SNPs involved in these interactions replicate previous research, in that they have previously been identified as schizophrenia vulnerability markers or implicate cognitive processes relevant to schizophrenia. This ability to replicate previous work suggests that our method has potential for detecting meaningful epistatic relationships among the genes that influence brain abnormalities in schizophrenia.
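To make the stochastic gradient boosting step more concrete, the following is a minimal sketch and not the authors' implementation. It assumes genotypes coded 0/1/2, a quantitative phenotype, squared-error loss, and one-split trees (stumps) fitted to a random half of the subjects in each round. The data are simulated, and every name and parameter value (`make_data`, `fit_stump`, `boost`, the causal SNP indices, the learning rate) is hypothetical and chosen only for illustration.

```python
import random

def make_data(n=300, n_snps=6, causal=(1, 4), noise_sd=0.3, seed=0):
    """Simulate genotypes coded 0/1/2 and a phenotype (all values are synthetic)."""
    rng = random.Random(seed)
    X = [[rng.choice((0, 1, 2)) for _ in range(n_snps)] for _ in range(n)]
    # Assumption for the demo: the phenotype is the sum of two causal SNPs plus noise.
    y = [sum(row[j] for j in causal) + rng.gauss(0.0, noise_sd) for row in X]
    return X, y

def sse(values):
    """Sum of squared deviations from the mean."""
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values)

def fit_stump(X, residuals, rows):
    """Best single-SNP split on the sampled rows.

    Returns (gain, snp_index, threshold, left_mean, right_mean), or None
    if no split separates the sampled rows.
    """
    parent = sse([residuals[i] for i in rows])
    best = None
    for j in range(len(X[0])):
        for threshold in (0.5, 1.5):  # the two possible cuts for 0/1/2 coding
            left = [residuals[i] for i in rows if X[i][j] <= threshold]
            right = [residuals[i] for i in rows if X[i][j] > threshold]
            if not left or not right:
                continue
            gain = parent - sse(left) - sse(right)
            if best is None or gain > best[0]:
                best = (gain, j, threshold,
                        sum(left) / len(left), sum(right) / len(right))
    return best

def boost(X, y, n_rounds=100, learning_rate=0.1, subsample=0.5, seed=0):
    """Stochastic gradient boosting with stumps; returns relative SNP importance."""
    rng = random.Random(seed)
    n = len(y)
    pred = [sum(y) / n] * n  # start from the phenotype mean
    importance = [0.0] * len(X[0])
    for _ in range(n_rounds):
        # For squared-error loss the negative gradient is the residual.
        residuals = [y[i] - pred[i] for i in range(n)]
        # The "stochastic" part: each stump sees only a random subsample.
        rows = rng.sample(range(n), max(2, int(subsample * n)))
        stump = fit_stump(X, residuals, rows)
        if stump is None:
            continue
        gain, j, threshold, left_mean, right_mean = stump
        importance[j] += gain
        for i in range(n):
            step = left_mean if X[i][j] <= threshold else right_mean
            pred[i] += learning_rate * step  # shrunken update
    total = sum(importance) or 1.0
    return [v / total for v in importance]

X, y = make_data()
importance = boost(X, y)
ranked = sorted(range(len(importance)), key=lambda j: importance[j], reverse=True)
print("SNP indices ranked by relative importance:", ranked)
```

Stumps capture only additive effects, so this sketch illustrates the dimension-reduction role of boosting rather than the detection of epistasis itself. Interactions would require deeper trees or a follow-up linear model with explicit interaction terms among the retained SNPs.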