Cai Zhipeng, Sabaa Hadi, Wang Yining, Goebel Randy, Wang Zhiquan, Xu Jiaofen, Stothard Paul, Lin Guohui
Department of Computing Science, University of Alberta, Edmonton, Alberta, Canada.
BMC Bioinformatics. 2009 Apr 21;10:115. doi: 10.1186/1471-2105-10-115.
The "common disease–common variant" hypothesis and genome-wide association studies have achieved numerous successes in the last three years, particularly in the genetic mapping of human diseases. Nevertheless, the power of association study methods is still low, in particular for quantitative traits, and a description of the full allelic spectrum is still deemed far out of reach. Given the increasing density of available single nucleotide polymorphisms (SNPs), and as suggested by the block-like structure of the human genome, a popular and promising strategy is to use haplotypes to capture the correlation structure of SNPs in regions of little recombination. The key to the success of this strategy is thus the ability to unambiguously determine the haplotype allele sharing status among the pedigree members. Association studies based on haplotype sharing status would have significantly reduced degrees of freedom and would be able to capture the combined effects of tightly linked causal variants.
For pedigree genotype datasets with medium SNP density, we present two methods for determining the haplotype allele sharing status among the pedigree members. An extensive simulation study showed that both methods performed nearly perfectly on breakpoint discovery, mutation haplotype allele discovery, and shared chromosomal region discovery.
For pedigree genotype datasets, the haplotype allele sharing status among the members can be deterministically, efficiently, and accurately determined, even for very small pedigrees. Given their excellent performance, the presented haplotype allele sharing status determination programs can be useful in many downstream applications including haplotype based association studies.