

GWIS--model-free, fast and exhaustive search for epistatic interactions in case-control GWAS.

Author information

National ICT Australia Victorian Research Lab, The University of Melbourne, Parkville, Victoria, Australia.

Publication information

BMC Genomics. 2013;14 Suppl 3(Suppl 3):S10. doi: 10.1186/1471-2164-14-S3-S10. Epub 2013 May 28.

Abstract

BACKGROUND

It has been hypothesized that multivariate analysis and systematic detection of epistatic interactions between explanatory genotyping variables may help resolve the problem of "missing heritability" currently observed in genome-wide association studies (GWAS). However, even the simplest bivariate analysis is still held back by significant statistical and computational challenges that are often addressed by reducing the set of analysed markers. Theoretically, it has been shown that combinations of loci may exist that show weak or no effects individually, but show significant (even complete) explanatory power over phenotype when combined. Reducing the set of analysed SNPs before bivariate analysis could easily omit such critical loci.
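The claim that two loci can be individually uninformative yet jointly decisive is easiest to see in a toy case. The following is a minimal sketch, assuming a hypothetical population in which two binary loci are equally frequent in all combinations and the phenotype is their exclusive-or; none of these names or numbers come from the study itself.

```python
from itertools import product

# Hypothetical toy population (an assumption for illustration only):
# two loci coded 0/1, every genotype combination equally frequent,
# phenotype = XOR of the two loci (pure epistasis, no marginal effect).
samples = [(a, b, a ^ b) for a, b in product((0, 1), repeat=2) for _ in range(250)]


def case_rate(rows):
    """Fraction of cases among the given (locus_a, locus_b, phenotype) rows."""
    return sum(r[2] for r in rows) / len(rows)


# Marginal view: case rate by genotype at each single locus.
marginal_a = {g: case_rate([r for r in samples if r[0] == g]) for g in (0, 1)}
marginal_b = {g: case_rate([r for r in samples if r[1] == g]) for g in (0, 1)}

# Joint view: case rate for each genotype pair.
joint = {
    (a, b): case_rate([r for r in samples if r[:2] == (a, b)])
    for a, b in product((0, 1), repeat=2)
}

print("marginal, locus A:", marginal_a)
print("marginal, locus B:", marginal_b)
print("joint:", joint)
```

In this construction each locus on its own leaves the case rate at one half, while the genotype pair determines the phenotype completely, so a filter that discards SNPs by univariate association would drop both loci.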

RESULTS

We have developed an exhaustive bivariate GWAS analysis methodology that yields a manageable subset of candidate marker pairs for subsequent analysis using other, often more computationally expensive, techniques. Our model-free filtering approach is based on classification using ROC curve analysis, an alternative to much slower regression-based modelling techniques. Exhaustive analysis of a study containing approximately 450,000 SNPs and 5,000 samples requires only 2 hours on a desktop CPU or 13 minutes on a GPU (Graphics Processing Unit). We validate our methodology on simulated datasets and on the seven Wellcome Trust Case-Control Consortium datasets, which represent a wide range of real-life GWAS challenges. We have identified SNP pairs that show considerably stronger association with disease than their component SNPs, which often show negligible effect in univariate analysis. When compared against results previously reported in the literature, our methods re-detect most significant SNP pairs and additionally detect many pairs absent from the literature that show strong association with disease. The high overlap suggests that our fast analysis could substitute for some slower alternatives.
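To make the idea of a model-free, ROC-based pair filter concrete, here is a minimal sketch under stated assumptions. It is not the GWIS implementation and does not reproduce its statistics: it simply scores each SNP pair by the in-sample AUC of a classifier that ranks the nine joint genotype cells by their case fraction. The function names (`cell_auc`, `scan_pairs`), the 0/1/2 genotype coding, the `min_auc` cut-off, and the simulated data with a planted interaction are all hypothetical.

```python
from itertools import combinations

import numpy as np


def cell_auc(cells, y, n_cells):
    """In-sample AUC of the classifier that ranks genotype cells by case fraction."""
    cases = np.bincount(cells[y == 1], minlength=n_cells).astype(float)
    controls = np.bincount(cells[y == 0], minlength=n_cells).astype(float)
    total = cases + controls
    frac = np.divide(cases, total, out=np.zeros(n_cells), where=total > 0)
    order = np.argsort(-frac, kind="stable")  # most case-enriched cell first
    tpr = np.concatenate(([0.0], np.cumsum(cases[order]) / cases.sum()))
    fpr = np.concatenate(([0.0], np.cumsum(controls[order]) / controls.sum()))
    # Area under the piecewise-linear ROC curve (trapezoid rule).
    return float(np.sum(np.diff(fpr) * (tpr[1:] + tpr[:-1]) / 2.0))


def scan_pairs(genotypes, y, min_auc=0.7):
    """Score every SNP pair; keep those whose joint AUC reaches min_auc."""
    n_snps = genotypes.shape[1]
    single = [cell_auc(genotypes[:, j], y, 3) for j in range(n_snps)]
    hits = []
    for i, j in combinations(range(n_snps), 2):
        joint = cell_auc(3 * genotypes[:, i] + genotypes[:, j], y, 9)
        if joint >= min_auc:
            hits.append((i, j, joint, max(single[i], single[j])))
    return sorted(hits, key=lambda h: -h[2])


# Simulated data (an assumption, not study data): 2,000 samples, 20 SNPs
# coded 0/1/2 with allele frequency 0.5, and a planted interaction between
# SNPs 3 and 7 (phenotype = XOR of heterozygosity) with 10% label noise.
rng = np.random.default_rng(0)
genotypes = rng.binomial(2, 0.5, size=(2000, 20))
signal = (genotypes[:, 3] == 1) ^ (genotypes[:, 7] == 1)
y = (signal ^ (rng.random(2000) < 0.1)).astype(int)

hits = scan_pairs(genotypes, y)
for i, j, joint, best_single in hits:
    print(f"SNPs ({i}, {j}): joint AUC {joint:.3f}, best single-SNP AUC {best_single:.3f}")
```

Because the AUC is computed on the same samples used to rank the cells, it is optimistic, which is acceptable for a filter whose survivors go on to slower follow-up analysis. A pure-Python loop like this is only illustrative; the timings quoted above would require vectorised or GPU counting of the contingency tables.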

CONCLUSIONS

We demonstrate that the proposed methodology is robust, fast and capable of exhaustive search for epistatic interactions using a standard desktop computer. First, our implementation is significantly faster than comparable algorithms, judging by timings reported in the literature, especially as our method allows simultaneous use of multiple statistical filters with low computing-time overhead. Second, for some diseases, we have identified hundreds of SNP pairs that pass formal multiple-testing (Bonferroni) correction and could form a rich source of hypotheses for follow-up analysis.
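The scale of the multiple-testing burden in an exhaustive pairwise scan can be worked out directly from the figures quoted in the abstract. The sketch below assumes a family-wise error rate of 0.05, which the abstract does not state, and uses the approximate SNP count and wall-clock times it does report.

```python
from math import comb

n_snps = 450_000  # approximate SNP count quoted in the abstract
alpha = 0.05      # assumed family-wise error rate (not stated in the abstract)

# Number of unordered SNP pairs tested in an exhaustive bivariate scan.
n_pairs = comb(n_snps, 2)

# Bonferroni correction: per-pair significance threshold.
bonferroni_threshold = alpha / n_pairs

# Throughput implied by the quoted wall-clock times (2 h CPU, 13 min GPU).
cpu_pairs_per_sec = n_pairs / (2 * 3600)
gpu_pairs_per_sec = n_pairs / (13 * 60)

print(f"pairs tested:         {n_pairs:,}")
print(f"Bonferroni threshold: {bonferroni_threshold:.3e}")
print(f"CPU throughput:       {cpu_pairs_per_sec:.2e} pairs/s")
print(f"GPU throughput:       {gpu_pairs_per_sec:.2e} pairs/s")
```

With roughly 10^11 pairs, a pair must reach a p-value on the order of 10^-13 to survive correction under this assumption, which is why a fast first-pass filter that produces a short candidate list is useful before more expensive modelling.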

AVAILABILITY

A web-based version of the software used for this analysis is available at http://bioinformatics.research.nicta.com.au/gwis.


