Tan Qihua, Christiansen Lene, Christensen Kaare, Bathum Lise, Li Shuxia, Zhao Jing Hua, Kruse Torben A
Department of Clinical Biochemistry and Genetics (KKA), Odense University Hospital, Denmark.
Genet Res. 2005 Dec;86(3):223-31. doi: 10.1017/S0016672305007792.
Haplotype inference has become an important part of human genetic data analysis because of its functional and statistical advantages over the single-locus approach in linkage disequilibrium mapping. Different statistical methods have been proposed for detecting haplotype-disease associations from unphased multi-locus genotype data, ranging from the early simple gene-counting method to recent work using the generalized linear model. However, these methods are either confined to the case-control design or unable to yield unbiased point and interval estimates of haplotype effects. Based on the popular logistic regression model, we present a new approach for haplotype association analysis of human disease traits. Using haplotype-based parameterization, our model infers the effects of specific haplotypes (point estimation) and constructs confidence intervals for the risks of haplotypes (interval estimation). Based on the estimated parameters, the model calculates haplotype frequencies conditional on the trait value for both discrete and continuous traits. Moreover, our model provides an overall significance level for the association between the disease trait and a group or all of the haplotypes. Because it uses direct maximization in haplotype estimation, our method also facilitates a computer simulation approach for correcting the significance levels of individual haplotypes to adjust for multiple testing. By applying the model to an empirical data set, we show that our method, based on the well-known logistic regression model, is a useful tool for haplotype association analysis of human disease traits.
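The idea of correcting the significance level of individual haplotypes by computer simulation can be illustrated with a minimal Python sketch. It is not the authors' procedure: the abstract does not specify their algorithm, so the sketch swaps in a standard single-step max-statistic permutation correction. It also assumes that each individual's haplotype copy numbers (0, 1, or 2) are already known, whereas the paper works from unphased genotypes, and it uses a simple 2x2 allelic chi-square in place of the logistic regression likelihood. The names `chi2_stat` and `max_stat_adjusted_p` are hypothetical.

```python
import random


def chi2_stat(dosage, trait):
    """2x2 chi-square comparing haplotype copy counts in cases vs. controls.

    dosage: copies of one haplotype per individual (0, 1, or 2); assumed known.
    trait:  binary disease status per individual (1 = case, 0 = control).
    """
    a = sum(d for d, y in zip(dosage, trait) if y == 1)  # copies in cases
    c = sum(d for d, y in zip(dosage, trait) if y == 0)  # copies in controls
    n_case = 2 * sum(trait)                 # chromosomes in cases
    n_ctrl = 2 * (len(trait) - sum(trait))  # chromosomes in controls
    b = n_case - a                          # other haplotypes in cases
    d = n_ctrl - c                          # other haplotypes in controls
    n = n_case + n_ctrl
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    if denom == 0:
        return 0.0
    return n * (a * d - b * c) ** 2 / denom


def max_stat_adjusted_p(dosages, trait, n_perm=1000, seed=0):
    """Permutation-adjusted p-value for each haplotype.

    Trait labels are shuffled to simulate the null hypothesis of no
    association; each observed statistic is compared with the largest
    statistic across all haplotypes in each shuffled data set.
    """
    rng = random.Random(seed)
    observed = {h: chi2_stat(d, trait) for h, d in dosages.items()}
    exceed = {h: 0 for h in dosages}
    shuffled = list(trait)
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        max_null = max(chi2_stat(d, shuffled) for d in dosages.values())
        for h in dosages:
            if max_null >= observed[h]:
                exceed[h] += 1
    # Add one to numerator and denominator so no p-value is exactly zero.
    return {h: (exceed[h] + 1) / (n_perm + 1) for h in dosages}
```

Calling `max_stat_adjusted_p` with a dictionary that maps haplotype labels to per-individual copy numbers, together with a list of case-control labels, returns one adjusted p-value per haplotype. Comparing every haplotype against the maximum null statistic is what accounts for testing several haplotypes at once.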