An efficient genome-wide association test for multivariate phenotypes based on the Fisher combination function.

Authors

Yang James J, Li Jia, Williams L Keoki, Buu Anne

Affiliations

School of Nursing, University of Michigan, Ann Arbor, Michigan, USA.

Department of Public Health Sciences, Henry Ford Health System, Detroit, Michigan, USA.

Publication

BMC Bioinformatics. 2016 Jan 5;17:19. doi: 10.1186/s12859-015-0868-6.

Abstract

BACKGROUND

In genome-wide association studies (GWAS) of complex diseases, the association between a SNP and each individual phenotype is usually weak. Combining multiple related phenotypic traits can increase the power to detect associated genes, which makes multivariate methods a practically important area for methodological work. This study provides a comprehensive review of existing methods for conducting GWAS on complex diseases with multiple phenotypes, including multivariate analysis of variance (MANOVA), principal component analysis (PCA), generalized estimating equations (GEE), the trait-based association test involving the extended Simes procedure (TATES), and the classical Fisher combination test. We propose a new method that relaxes the unrealistic independence assumption of the classical Fisher combination test and is computationally efficient. To demonstrate applications of the proposed method, we also present the results of a statistical analysis of the Study of Addiction: Genetics and Environment (SAGE) data.
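For orientation, the classical Fisher combination test mentioned above can be sketched in a few lines. This is the textbook procedure, not the authors' proposed method: for k independent p-values the statistic T = -2 Σ ln p_i follows a chi-square distribution with 2k degrees of freedom under the null hypothesis. The function names below are illustrative, and the closed-form tail probability applies only because the degrees of freedom are even.

```python
import math


def chi2_sf_even_df(x, df):
    """Upper-tail probability P(X > x) for a chi-square variable with even df.

    Uses the closed form exp(-x/2) * sum_{j=0}^{df/2 - 1} (x/2)^j / j!.
    """
    if df <= 0 or df % 2 != 0:
        raise ValueError("df must be a positive even integer")
    half = x / 2.0
    term = 1.0
    total = 1.0
    for j in range(1, df // 2):
        term *= half / j
        total += term
    return math.exp(-half) * total


def fisher_combine(p_values):
    """Classical Fisher combination of p-values.

    Assumes the p-values are independent, which is exactly the assumption
    the abstract calls unrealistic for correlated phenotypes.
    """
    if not p_values or any(not (0.0 < p <= 1.0) for p in p_values):
        raise ValueError("p-values must lie in (0, 1]")
    stat = -2.0 * sum(math.log(p) for p in p_values)
    return stat, chi2_sf_even_df(stat, 2 * len(p_values))


# Hypothetical marginal p-values for one SNP tested against three phenotypes.
stat, combined_p = fisher_combine([0.04, 0.03, 0.02])
```

With a single p-value the combined p-value equals the input, which is a quick sanity check on the implementation.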

RESULTS

Our simulation study shows that the proposed method has higher power than existing methods while controlling the type I error rate. The GEE and the classical Fisher combination test, on the other hand, do not control the type I error rate and thus are not recommended. In general, the power of the competing methods decreases as the correlation between phenotypes increases. All the methods tend to have lower power when the multivariate phenotypes come from long-tailed distributions. The real data analysis also demonstrates that the proposed method allows us to compare the marginal results with the multivariate results and to identify which SNPs are specific to a particular phenotype and which contribute to the common construct.
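The failure of the classical Fisher combination test to control the type I error rate under correlation can be illustrated with a small null simulation. The sketch below is not the authors' simulation design: the equicorrelated z-scores, the choice of four phenotypes, the correlation of 0.8, and all function names are assumptions made for illustration only.

```python
import math
import random


def fisher_p(p_values):
    """Combined p-value from the classical Fisher test (independence assumed)."""
    k = len(p_values)
    half = -sum(math.log(p) for p in p_values)  # equals T / 2
    term, total = 1.0, 1.0
    for j in range(1, k):
        term *= half / j
        total += term
    return math.exp(-half) * total


def rejection_rate(rho, k=4, n_sim=20000, alpha=0.05, seed=1):
    """Fraction of null replicates rejected at level alpha.

    Each replicate draws k null z-scores with pairwise correlation rho
    (a shared component plus independent noise), converts them to
    two-sided p-values, and applies the classical Fisher test.
    """
    rng = random.Random(seed)
    a, b = math.sqrt(rho), math.sqrt(1.0 - rho)
    rejections = 0
    for _ in range(n_sim):
        shared = rng.gauss(0.0, 1.0)
        p_values = []
        for _phenotype in range(k):
            z = a * shared + b * rng.gauss(0.0, 1.0)
            p = math.erfc(abs(z) / math.sqrt(2.0))  # two-sided normal p-value
            p_values.append(max(p, 1e-300))  # guard against log(0)
        if fisher_p(p_values) < alpha:
            rejections += 1
    return rejections / n_sim


rate_independent = rejection_rate(0.0)
rate_correlated = rejection_rate(0.8)
```

Under independence the rejection rate should sit near the nominal level, while under strong positive correlation it rises above it, which is the behavior that motivates relaxing the independence assumption.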

CONCLUSIONS

The proposed method outperforms existing methods in most settings and is well suited to GWAS on complex diseases with multiple phenotypes, such as substance abuse disorders.

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/8a3c/4704475/9b014a223ee5/12859_2015_868_Fig1_HTML.jpg
