Chen Yi-Hau, Kao Jau-Tsuen
Institute of Statistical Science, Academia Sinica, Taipei 11529, Taiwan.
BMC Genet. 2006 Aug 15;7:43. doi: 10.1186/1471-2156-7-43.
Genetic association analysis using haplotypes as the basic genetic units is anticipated to be a powerful strategy for discovering genes that predispose to complex human diseases. In particular, the increasing availability of high-resolution genetic markers such as single-nucleotide polymorphisms (SNPs) has made haplotype-based association analysis an attractive alternative to single-marker analysis.
We consider haplotype association analysis under the population-based case-control study design. We propose a multinomial logistic model for haplotype analysis with unphased genotype data, which can be decomposed into a prospective logistic model for disease risk and a model for the haplotype-pair distribution in the control population. Environmental factors can be readily incorporated, so haplotype-environment interactions can be assessed within the proposed model. Maximum likelihood estimation with unphased genotype data can be conveniently implemented by applying the EM algorithm to a prospective multinomial logistic regression model, ignoring the case-control design. We apply the proposed method to a hypertriglyceridemia study and identify 3 haplotypes in the apolipoprotein A5 gene that are associated with increased risk of hypertriglyceridemia. A haplotype-age interaction effect is also identified. Simulation studies show that the proposed estimator has satisfactory finite-sample performance.
Our results suggest that the proposed method can serve as a useful alternative to existing methods and as a reliable tool for haplotype-based association analysis in case-control studies.
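To illustrate how an EM algorithm handles unphased genotype data, the following is a minimal sketch of the classical EM estimator of haplotype frequencies for two biallelic SNPs. It is not the multinomial logistic estimator proposed in this paper: it has no disease model and no covariates, and it assumes Hardy-Weinberg equilibrium. The function name `em_haplotype_freqs`, the 0/1/2 genotype coding (count of the "1" allele at each SNP), and the iteration settings are all hypothetical choices made for illustration.

```python
from collections import Counter

# The four possible haplotypes over two biallelic SNPs (alleles coded 0/1).
HAPS = ["00", "01", "10", "11"]


def em_haplotype_freqs(genotypes, n_iter=200, tol=1e-10):
    """Estimate haplotype frequencies from unphased two-SNP genotypes.

    genotypes: list of (g1, g2) tuples, each g in {0, 1, 2}, giving the
    number of copies of allele 1 at SNP 1 and SNP 2.
    Assumes Hardy-Weinberg equilibrium (an assumption of this sketch).
    """
    counts = Counter(genotypes)
    n_chrom = 2 * len(genotypes)
    freqs = {h: 0.25 for h in HAPS}  # uniform starting values

    for _ in range(n_iter):
        expected = {h: 0.0 for h in HAPS}
        for (g1, g2), n in counts.items():
            if g1 == 1 and g2 == 1:
                # E-step: the double heterozygote is the only phase-ambiguous
                # genotype; it is either 00/11 or 01/10.
                w_cis = freqs["00"] * freqs["11"]
                w_trans = freqs["01"] * freqs["10"]
                total = w_cis + w_trans
                p_cis = w_cis / total if total > 0 else 0.5
                expected["00"] += n * p_cis
                expected["11"] += n * p_cis
                expected["01"] += n * (1 - p_cis)
                expected["10"] += n * (1 - p_cis)
            else:
                # Phase is unambiguous: read the two haplotypes off directly.
                hap_a = f"{g1 // 2}{g2 // 2}"
                hap_b = f"{(g1 + 1) // 2}{(g2 + 1) // 2}"
                expected[hap_a] += n
                expected[hap_b] += n

        # M-step: new frequencies are expected haplotype counts per chromosome.
        new_freqs = {h: expected[h] / n_chrom for h in HAPS}
        converged = max(abs(new_freqs[h] - freqs[h]) for h in HAPS) < tol
        freqs = new_freqs
        if converged:
            break
    return freqs
```

For example, `em_haplotype_freqs([(0, 0), (2, 2), (1, 1)])` resolves the single double heterozygote using the frequencies implied by the two homozygotes. In the paper's setting, the analogous E-step would weight each haplotype pair compatible with an observed genotype, and the M-step would fit the prospective multinomial logistic regression rather than simply counting haplotypes.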