Center for Research in Environmental Epidemiology, Universitat Pompeu Fabra and CIBER Epidemiología y Salud Pública, C/Doctor Aiguader 88, Barcelona, Spain.
Department of Systems Biology, Bioinformatics and Medical Statistics Group, Universitat de Vic-Universitat Central de Catalunya, C. Sagrada Familia 7, Vic, Spain.
IEEE/ACM Trans Comput Biol Bioinform. 2016 Nov;13(6):1100-1106. doi: 10.1109/TCBB.2015.2509977. Epub 2015 Dec 22.
The goal of genome-wide association studies (GWAS) is to identify genetic variants, usually single nucleotide polymorphisms (SNPs), that are associated with disease risk. However, the SNPs detected so far by GWAS for most common diseases explain only a small proportion of their total heritability. Gene set analysis (GSA) has been proposed as an alternative to single-SNP analysis, with the aim of improving the power of genetic association studies. Nevertheless, most GSA methods rely on computationally expensive procedures that make them infeasible to apply in GWAS. We propose a new GSA method, referred to as globalEVT, which uses extreme value theory to derive gene-level p-values. GlobalEVT dramatically reduces the computational requirements compared to other GSA approaches. In addition, as illustrated in a simulation study, the new approach improves power by allowing a different inheritance model for each genetic variant, and it accommodates correlation between SNPs. Analysis of real data from an attention-deficit/hyperactivity disorder (ADHD) study illustrates the importance of using GSA approaches for exploring new susceptibility genes. Specifically, the globalEVT method is able to detect genes related to cyclophilin A-like domain proteins, which are known to play an important role in the mechanisms of ADHD development.
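The abstract does not spell out how extreme value theory yields a gene-level p-value, so the following is only a generic sketch of the idea and not the globalEVT algorithm itself. It assumes the gene-level statistic is the maximum of the per-SNP statistics, fits a Gumbel distribution to a small sample of null maxima by the method of moments, and reads the p-value from the fitted tail instead of running a very large number of permutations. All function names are hypothetical, and the null maxima here are simulated from independent chi-square(1) statistics as a stand-in for phenotype permutations, which in real data would also preserve the correlation between SNPs.

```python
import math
import random
import statistics

EULER_GAMMA = 0.5772156649015329  # Euler-Mascheroni constant


def fit_gumbel_moments(maxima):
    """Method-of-moments Gumbel fit; returns (location, scale)."""
    mean = statistics.fmean(maxima)
    sd = statistics.stdev(maxima)
    scale = sd * math.sqrt(6) / math.pi
    loc = mean - EULER_GAMMA * scale
    return loc, scale


def gumbel_tail_pvalue(observed_max, loc, scale):
    """P(M >= observed_max) under the fitted Gumbel distribution."""
    z = (observed_max - loc) / scale
    if z < -700:  # far below the null maxima; avoid overflow in exp
        return 1.0
    # 1 - exp(-exp(-z)), computed stably for small tail probabilities
    return -math.expm1(-math.exp(-z))


def simulate_null_maxima(n_snps, n_draws, rng):
    """Assumed stand-in for permutations: the maximum of n_snps
    independent chi-square(1) statistics, drawn n_draws times."""
    return [
        max(rng.gauss(0.0, 1.0) ** 2 for _ in range(n_snps))
        for _ in range(n_draws)
    ]


rng = random.Random(0)
null_maxima = simulate_null_maxima(n_snps=50, n_draws=200, rng=rng)
loc, scale = fit_gumbel_moments(null_maxima)

# A gene whose maximum statistic is typical under the null,
# and one whose maximum lies far in the upper tail.
p_typical = gumbel_tail_pvalue(statistics.median(null_maxima), loc, scale)
p_extreme = gumbel_tail_pvalue(30.0, loc, scale)
print(f"typical gene p = {p_typical:.3f}, extreme gene p = {p_extreme:.2e}")
```

The point of the sketch is the cost argument: 200 null draws cannot resolve an empirical p-value below 1/200, whereas the fitted tail can be extrapolated to much smaller values. Whether a Gumbel fit is adequate for a given statistic is itself an assumption that would need checking.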