Li Jun, Zhang Kui, Yi Nengjun
Department of Biostatistics, Section on Statistical Genetics, University of Alabama at Birmingham, Birmingham, AL 35294-0022, USA.
Hum Hered. 2011;71(3):148-60. doi: 10.1159/000324841. Epub 2011 Jul 20.
BACKGROUND/AIMS: Haplotype-based genetic association studies are a powerful tool for discovering and characterizing the genetic basis of complex human diseases. However, statistical methods for detecting haplotype-haplotype and haplotype-environment interactions have not yet been fully developed, owing to two difficulties: the large number of potential haplotypes and the fact that haplotype pairs are not directly observed. Furthermore, methods for detecting associations between rare haplotypes and disease have not kept pace with those for common haplotypes.
METHODS/RESULTS: We propose an efficient and robust method to tackle these problems, based on a Bayesian hierarchical generalized linear model. Our model simultaneously fits environmental effects, main effects of numerous common and rare haplotypes, and haplotype-haplotype and haplotype-environment interactions. The key to the approach is a continuous prior distribution on the coefficients that favors sparseness in the fitted model and facilitates computation. We develop a fast expectation-maximization algorithm that fits the model by estimating the posterior modes of the coefficients. We incorporate our algorithm into the iteratively weighted least squares procedure for classical generalized linear models, as implemented in the R function glm. We evaluate the proposed method and compare its performance with that of existing methods on extensive simulated data.
CONCLUSION: The results show that the proposed method performs well in all simulated scenarios and is more powerful than the existing approaches.
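The fitting strategy described under METHODS/RESULTS can be illustrated with a minimal Python sketch. It is not the authors' implementation. The abstract does not name the prior, so the sketch assumes independent Student-t priors on the coefficients (written as normal priors with unknown variances), a logistic model, and an unpenalized intercept; the function name `fit_sparse_logistic_em` and the hyperparameters `nu` and `scale` are hypothetical. Each iteration takes the expected prior precision of every coefficient given its current value (E-step), then performs one penalized weighted least squares update (M-step).

```python
import numpy as np


def fit_sparse_logistic_em(X, y, nu=1.0, scale=0.5, n_iter=200, tol=1e-8):
    """Approximate posterior-mode logistic regression.

    Assumed prior (not taken from the abstract):
    beta_j ~ N(0, tau_j^2), tau_j^2 ~ scaled inverse chi-square(nu, scale^2),
    which is a Student-t prior on beta_j after integrating out tau_j^2.
    """
    n, p = X.shape
    design = np.column_stack([np.ones(n), X])  # prepend an intercept column
    beta = np.zeros(p + 1)

    for _ in range(n_iter):
        # E-step: expected prior precision E[1 / tau_j^2 | beta_j].
        precision = (nu + 1.0) / (nu * scale**2 + beta[1:] ** 2)
        penalty = np.concatenate([[0.0], precision])  # intercept unpenalized

        # M-step: one weighted least squares update with a ridge-type penalty.
        eta = np.clip(design @ beta, -30.0, 30.0)
        mu = 1.0 / (1.0 + np.exp(-eta))
        w = np.clip(mu * (1.0 - mu), 1e-6, None)  # working weights
        z = eta + (y - mu) / w                    # working response
        lhs = design.T @ (w[:, None] * design) + np.diag(penalty)
        rhs = design.T @ (w * z)
        new_beta = np.linalg.solve(lhs, rhs)

        converged = np.max(np.abs(new_beta - beta)) < tol
        beta = new_beta
        if converged:
            break

    return beta


# Toy data: 10 haplotype-dosage-like predictors (0, 1, or 2 copies),
# of which only the first has a nonzero effect.
rng = np.random.default_rng(0)
X = rng.binomial(2, 0.3, size=(500, 10)).astype(float)
true_effects = np.zeros(10)
true_effects[0] = 1.2
prob = 1.0 / (1.0 + np.exp(-(-1.0 + X @ true_effects)))
y = rng.binomial(1, prob).astype(float)

beta_hat = fit_sparse_logistic_em(X, y)
print(np.round(beta_hat, 2))  # intercept first, then the 10 slopes
```

Interaction terms would enter such a sketch as additional columns of `X` (for example, products of haplotype dosages with each other or with an environmental covariate). The prior shrinks small coefficients strongly toward zero while leaving large ones comparatively untouched, which is one way a continuous prior can favor sparseness.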