An efficient hierarchical generalized linear mixed model for pathway analysis of genome-wide association studies.

Affiliation

Department of Biostatistics, Vanderbilt University, Nashville, TN 37232, USA.

Publication information

Bioinformatics. 2011 Mar 1;27(5):686-92. doi: 10.1093/bioinformatics/btq728. Epub 2011 Jan 25.

Abstract

MOTIVATION

In genome-wide association studies (GWAS) of complex diseases, genetic variants with real but weak associations often fail to be detected at the stringent genome-wide significance level. Pathway analysis, which tests disease association using the combined association signals from a group of variants in the same pathway, has become increasingly popular. However, because of the complexities of genetic data and the large sample sizes in typical GWAS, pathway analysis remains challenging. We propose a new statistical model for pathway analysis of GWAS. The model includes a fixed effects component that models the mean disease association for a group of genes, and a random effects component that models how each gene's association with disease varies about the gene group mean; it therefore belongs to the class of mixed effects models.
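As a rough illustration of the fixed-plus-random structure described above, a generic linear mixed model can be written as follows. This is only a sketch under stated assumptions, not the specification used in the paper: the symbols y_{gj}, x_g, \mu, \beta, b_g and \varepsilon_{gj} are hypothetical, the identity link is a simplification of the generalized linear mixed model named in the title, and the corrections for overlapping genes and LD are not represented.

```latex
% Illustrative sketch only; all symbols are assumptions, not taken from the paper.
% y_{gj}: association statistic for variant j in gene g
% x_g:    1 if gene g belongs to the pathway under test, 0 otherwise
\begin{align*}
  y_{gj} &= \mu + \beta\, x_g + b_g + \varepsilon_{gj}, \\
  b_g &\sim N(0, \sigma_b^2), \qquad \varepsilon_{gj} \sim N(0, \sigma_e^2).
\end{align*}
% Fixed effects: \mu is the mean for genes outside the pathway and
%   \mu + \beta the mean for genes inside it.
% Random effect: b_g is the deviation of gene g from its group mean.
% Under this sketch, pathway association corresponds to testing H_0: \beta = 0.
```

In this form, the group mean is carried by the fixed effects and the gene-to-gene variation about that mean by the random effect b_g, which is the division of roles the abstract describes.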

RESULTS

The proposed model is computationally efficient and uses only summary statistics. In addition, it corrects for the presence of overlapping genes and linkage disequilibrium (LD). Using simulated and real GWAS data, we showed that our model improved power over currently available pathway analysis methods while preserving the type I error rate. Furthermore, using the WTCCC Type 1 Diabetes (T1D) dataset, we demonstrated that mixed model analysis identified meaningful biological processes that agreed well with previous reports on T1D. The proposed methodology therefore provides an efficient statistical modeling framework for systems analysis of GWAS.

AVAILABILITY

The software code for mixed models analysis is freely available at http://biostat.mc.vanderbilt.edu/LilyWang.

Similar articles

A variable selection method for genome-wide association studies.
Bioinformatics. 2011 Jan 1;27(1):1-8. doi: 10.1093/bioinformatics/btq600. Epub 2010 Oct 29.

Cited by

Fast and flexible linear mixed models for genome-wide genetics.
PLoS Genet. 2019 Feb 8;15(2):e1007978. doi: 10.1371/journal.pgen.1007978. eCollection 2019 Feb.
