Do Chuong B, Mahabhashyam Mahathi S P, Brudno Michael, Batzoglou Serafim
Department of Computer Science, Stanford University, Stanford, California 94305, USA.
Genome Res. 2005 Feb;15(2):330-40. doi: 10.1101/gr.2821705.
To study gene evolution across a wide range of organisms, biologists need accurate tools for multiple sequence alignment of protein families. Obtaining accurate alignments, however, is a difficult computational problem, not only because of the high computational cost but also because proper objective functions for measuring alignment quality are lacking. In this paper, we introduce probabilistic consistency, a novel scoring function for multiple sequence comparisons. We present ProbCons, a practical tool for progressive protein multiple sequence alignment based on probabilistic consistency, and evaluate its performance on several standard alignment benchmark data sets. On the BAliBASE, SABmark, and PREFAB benchmark alignment databases, ProbCons achieves statistically significant improvement over other leading methods while maintaining practical speed. ProbCons is publicly available as a Web resource.