Zhang Kui, Deng Minghua, Chen Ting, Waterman Michael S, Sun Fengzhu
Molecular and Computational Biology Program, Department of Biological Sciences, University of Southern California, 1042 West 36th Place, DRB142, Los Angeles, CA 90089-1113, USA.
Proc Natl Acad Sci U S A. 2002 May 28;99(11):7335-9. doi: 10.1073/pnas.102186799.
We develop a dynamic programming algorithm for haplotype block partitioning to minimize the number of representative single nucleotide polymorphisms (SNPs) required to account for most of the common haplotypes in each block. Any measure of haplotype quality can be used in the algorithm and of course the measure should depend on the specific application. The dynamic programming algorithm is applied to analyze the chromosome 21 haplotype data of Patil et al. [Patil, N., Berno, A. J., Hinds, D. A., Barrett, W. A., Doshi, J. M., Hacker, C. R., Kautzer, C. R., Lee, D. H., Marjoribanks, C., McDonough, D. P., et al. (2001) Science 294, 1719-1723], who searched for blocks of limited haplotype diversity. Using the same criteria as in Patil et al., we identify a total of 3,582 representative SNPs and 2,575 blocks that are 21.5% and 37.7% smaller, respectively, than those identified using a greedy algorithm of Patil et al. We also apply the dynamic programming algorithm to the same data set based on haplotype diversity. A total of 3,982 representative SNPs and 1,884 blocks are identified to account for 95% of the haplotype diversity in each block.
我们开发了一种用于单倍型块划分的动态规划算法,以尽量减少解释每个块中大多数常见单倍型所需的代表性单核苷酸多态性(SNP)数量。算法中可以使用任何单倍型质量度量,当然该度量应取决于具体应用。应用动态规划算法分析了帕蒂尔等人[帕蒂尔,N.,贝尔诺,A. J.,欣兹,D. A.,巴雷特,W. A.,多希,J. M.,哈克,C. R.,考泽,C. R.,李,D. H.,马乔里班克斯,C.,麦克多诺,D. P.等人(2001年)《科学》294卷,1719 - 1723页]的21号染色体单倍型数据,他们寻找单倍型多样性有限的块。使用与帕蒂尔等人相同的标准,我们总共识别出3582个代表性SNP和2575个块,分别比使用帕蒂尔等人的贪心算法识别出的少21.5%和37.7%。我们还基于单倍型多样性将动态规划算法应用于同一数据集。总共识别出3982个代表性SNP和1884个块,以解释每个块中95%的单倍型多样性。