Pearson William R, Sierk Michael L
Department of Biochemistry and Molecular Genetics, University of Virginia, Charlottesville, VA 22908, USA.
Curr Opin Struct Biol. 2005 Jun;15(3):254-60. doi: 10.1016/j.sbi.2005.05.005.
Modern sequence alignment algorithms are used routinely to identify homologous proteins, proteins that share a common ancestor. Homologous proteins always share similar structures and often have similar functions. Over the past 20 years, sequence comparison has become both more sensitive, largely because of profile-based methods, and more reliable, because of more accurate statistical estimates. As sequence and structure databases become larger, and comparison methods become more powerful, reliable statistical estimates will become even more important for distinguishing similarities that are due to homology from those that are due to analogy (convergence). The newest sequence alignment methods are more sensitive than older methods, but more accurate statistical estimates are needed for their full power to be realized.
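The point about growing databases can be illustrated with a small sketch. It assumes the standard Karlin-Altschul bit-score relation E = m * n * 2^(-S'), as used by BLAST-style tools. The abstract itself does not give this formula. The function names, the query length, and the database sizes below are hypothetical choices for illustration, and raw lengths are used without any effective-length correction.

```python
import math


def evalue_from_bits(bit_score: float, query_len: int, db_len: int) -> float:
    """Expected number of chance alignments scoring at least bit_score.

    Assumes the Karlin-Altschul relation E = m * n * 2**(-S'), where m is the
    query length, n is the total database length in residues, and S' is the
    normalized (bit) score.
    """
    return query_len * db_len * 2.0 ** (-bit_score)


def pvalue_from_evalue(evalue: float) -> float:
    """Probability of at least one chance hit, P = 1 - exp(-E)."""
    return -math.expm1(-evalue)


# Illustrative values only: one fixed bit score, three database sizes.
query_len = 300
bit_score = 40.0
for db_len in (10**7, 10**9, 10**11):
    e = evalue_from_bits(bit_score, query_len, db_len)
    print(f"db residues={db_len:.0e}  E={e:.3g}  P={pvalue_from_evalue(e):.3g}")
```

Under this assumption the expectation value grows linearly with database size, so a score that looks clearly significant against a small database can fall within the range expected by chance against a much larger one.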