Chen Zhixiang, Fu Bin, Schweller Robert, Yang Boting, Zhao Zhiyu, Zhu Binhai
Department of Computer Science, University of Texas-Pan American, Edinburg, Texas 78539, USA.
J Comput Biol. 2008 Jun;15(5):535-46. doi: 10.1089/cmb.2008.0003.
In this paper, we develop a probabilistic model that addresses two realistic aspects of the singular haplotype reconstruction problem: the incompleteness and inconsistency introduced by the DNA sequencing process that generates the input haplotype fragments, and the common practice used to generate synthetic data in experimental algorithm studies. Within this model, we design three algorithms that reconstruct the two unknown haplotypes from the given matrix of haplotype fragments with provably high probability and in time linear in the size of the input matrix. We also present experimental results that are consistent with the theoretical efficiency of these algorithms. The software implementing our algorithms is publicly available, including a real-time online demonstration.
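To make the setting concrete, the following is a minimal sketch of the kind of input the abstract describes. It is not one of the paper's three algorithms, which the abstract does not spell out. It assumes an alphabet of 'A'/'B' alleles with '-' for an unread site, assumes only heterozygous sites are kept so the two haplotypes are complementary, and uses hypothetical names (`make_fragments`, `reconstruct`, `p_missing`, `p_error`). Fragments are generated by copying one haplotype and then blanking or flipping sites at random. The naive reconstruction splits rows by agreement with the first row and takes a per-column majority vote, which touches each matrix entry a constant number of times.

```python
import random

def make_fragments(h1, h2, n_frags, p_missing, p_error, rng):
    """Synthetic fragment matrix: each row copies h1 or h2, then each site is
    blanked with probability p_missing (incompleteness) or flipped with
    probability p_error (inconsistency). Parameter names are hypothetical."""
    rows = []
    for _ in range(n_frags):
        source = rng.choice((h1, h2))
        row = []
        for allele in source:
            r = rng.random()
            if r < p_missing:
                row.append('-')
            elif r < p_missing + p_error:
                row.append('B' if allele == 'A' else 'A')
            else:
                row.append(allele)
        rows.append(''.join(row))
    return rows

def consensus(group, n_sites):
    """Per-column majority vote over a group of fragments ('-' on a tie)."""
    out = []
    for j in range(n_sites):
        a = sum(1 for row in group if row[j] == 'A')
        b = sum(1 for row in group if row[j] == 'B')
        out.append('-' if a == b else ('A' if a > b else 'B'))
    return ''.join(out)

def reconstruct(rows):
    """Naive illustration only: partition rows by agreement with the first
    row, then vote within each part. Assumes complementary haplotypes."""
    seed = rows[0]
    groups = ([], [])
    for row in rows:
        agree = differ = 0
        for a, b in zip(seed, row):
            if a == '-' or b == '-':
                continue  # skip sites missing in either fragment
            if a == b:
                agree += 1
            else:
                differ += 1
        groups[0 if agree >= differ else 1].append(row)
    return tuple(consensus(g, len(seed)) for g in groups)

if __name__ == "__main__":
    rng = random.Random(0)
    frags = make_fragments("AABAB", "BBABA", 12, 0.1, 0.05, rng)
    print(frags)
    print(reconstruct(frags))
```

A sketch like this breaks down when the first row is mostly gaps or when error rates are high. Handling those regimes with provable guarantees is the kind of question the paper's probabilistic model is meant to address.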