Liu Jianfeng, Pei Yufang, Papasian Chris J, Deng Hong-Wen
Department of Orthopedic Surgery, School of Medicine, University of Missouri-Kansas City, Kansas City, Missouri, USA.
Genet Epidemiol. 2009 Apr;33(3):217-27. doi: 10.1002/gepi.20372.
Genome-wide association (GWA) studies are becoming a powerful tool for deciphering the genetic basis of complex human diseases and traits. Currently, univariate analysis is the most commonly used method for identifying genes associated with a disease or phenotype under study. A major limitation of univariate analysis is that it may fail to use the information in multiple correlated phenotypes, which are usually measured and collected in practice. Multivariate analysis has proven to be a powerful approach in linkage studies of complex diseases and traits, but it has received little attention in GWA studies. In this study, we aim to develop a bivariate analytical method for GWA studies that handles the complex situation in which one continuous trait and one binary trait are measured. To this end, we propose a modified extended generalized estimating equation (EGEE) method that combines two different generalized linear models, one for each phenotype, using a seemingly unrelated regression model. We assessed the performance of the resulting bivariate analyses through extensive simulations and real data analyses. The simulation results demonstrated that our EGEE-based bivariate method has greater statistical power than univariate analyses under a variety of simulation scenarios. Notably, this advantage is consistent whether or not the two traits are phenotypically correlated. Our study has practical importance: when multiple phenotypes are available, multivariate analysis can always be used as a screening tool, at no extra cost in statistical power or false-positive rate. Analyses of empirical GWA data further affirm the advantages of our bivariate analytical method.
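The abstract does not give the EGEE equations, so the following is only a rough sketch of the general idea of testing one SNP against a continuous trait and a binary trait jointly. It is not the authors' modified EGEE. It assumes an additive genotype coding (0, 1, 2), no other covariates, and an independence working correlation, so the point estimates equal those of separate linear and logistic fits and the between-trait correlation enters only through the robust sandwich covariance. The SNP is then tested with a joint 2-degree-of-freedom Wald statistic. All function names and the simulated data are hypothetical.

```python
import math
import random


def inv2(m):
    """Invert a 2x2 matrix given as nested lists."""
    (a, b), (c, d) = m
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


def fit_linear(g, y):
    """Identity-link model y ~ 1 + g; returns (beta, (X'X)^-1, residuals)."""
    xtx = [[len(g), sum(g)], [sum(g), sum(v * v for v in g)]]
    xty = [sum(y), sum(v * t for v, t in zip(g, y))]
    inv = inv2(xtx)
    beta = [inv[r][0] * xty[0] + inv[r][1] * xty[1] for r in range(2)]
    resid = [t - beta[0] - beta[1] * v for v, t in zip(g, y)]
    return beta, inv, resid


def fit_logistic(g, y, iters=25):
    """Logit-link model y ~ 1 + g fitted by Newton-Raphson."""
    beta = [0.0, 0.0]
    for _ in range(iters):
        p = [sigmoid(beta[0] + beta[1] * v) for v in g]
        w = [q * (1.0 - q) for q in p]
        swg = sum(a * v for a, v in zip(w, g))
        info = [[sum(w), swg], [swg, sum(a * v * v for a, v in zip(w, g))]]
        score = [sum(t - q for t, q in zip(y, p)),
                 sum(v * (t - q) for v, t, q in zip(g, y, p))]
        inv = inv2(info)  # inverse information used in this Newton step
        beta = [beta[r] + inv[r][0] * score[0] + inv[r][1] * score[1]
                for r in range(2)]
    resid = [t - sigmoid(beta[0] + beta[1] * v) for v, t in zip(g, y)]
    return beta, inv, resid


def bivariate_wald(g, y_cont, y_bin):
    """Joint 2-df Wald test of the SNP effect on both traits."""
    b1, inv_lin, r1 = fit_linear(g, y_cont)
    b2, inv_log, r2 = fit_logistic(g, y_bin)
    # Per-subject contribution to each SNP coefficient (SNP row of bread^-1 x_i r_i).
    h1 = [(inv_lin[1][0] + inv_lin[1][1] * v) * e for v, e in zip(g, r1)]
    h2 = [(inv_log[1][0] + inv_log[1][1] * v) * e for v, e in zip(g, r2)]
    # Robust sandwich covariance; the off-diagonal term carries the
    # within-subject correlation between the two traits.
    c12 = sum(a * b for a, b in zip(h1, h2))
    cov = [[sum(a * a for a in h1), c12], [c12, sum(b * b for b in h2)]]
    ci = inv2(cov)
    s1, s2 = b1[1], b2[1]
    wald = (s1 * (ci[0][0] * s1 + ci[0][1] * s2)
            + s2 * (ci[1][0] * s1 + ci[1][1] * s2))
    return wald, math.exp(-wald / 2.0)  # chi-square (2 df) upper tail


def simulate(n, maf, b_cont, b_bin, seed=0):
    """Hypothetical toy data: a shared noise term correlates the two traits."""
    rng = random.Random(seed)
    g, y1, y2 = [], [], []
    for _ in range(n):
        gi = (rng.random() < maf) + (rng.random() < maf)
        shared = rng.gauss(0.0, 1.0)
        g.append(gi)
        y1.append(b_cont * gi + shared + rng.gauss(0.0, 1.0))
        y2.append(1 if rng.random() < sigmoid(-0.5 + b_bin * gi + shared) else 0)
    return g, y1, y2


if __name__ == "__main__":
    g, y1, y2 = simulate(n=500, maf=0.3, b_cont=0.5, b_bin=0.5, seed=1)
    wald, p = bivariate_wald(g, y1, y2)
    print(f"Wald = {wald:.2f}, p = {p:.3g}")
```

In a genome scan this test would be repeated for each SNP. A fuller EGEE-style treatment would also estimate the association between the two traits and use it in the working covariance, which this sketch deliberately omits.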