Gaucher E A, Miyamoto M M, Benner S A
Department of Chemistry and Molecular Cell Biology Program, College of Medicine, University of Florida, Gainesville, FL 32611-7200, USA.
Proc Natl Acad Sci U S A. 2001 Jan 16;98(2):548-52. doi: 10.1073/pnas.98.2.548.
The divergent evolution of protein sequences from genomic databases can be analyzed by the use of different mathematical models. The most common treat all sites in a protein sequence as equally variable. More sophisticated models acknowledge the fact that purifying selection generally tolerates variable amounts of amino acid replacement at different positions in a protein sequence. In their "stationary" versions, such models assume that the replacement rate at individual positions remains constant throughout evolutionary history. "Nonstationary" covarion versions, however, allow the replacement rate at a position to vary in different branches of the evolutionary tree. Recently, statistical methods have been developed that highlight this type of variation in replacement rates. Here, we show how positions that have variable rates of divergence in different regions of a tree ("covarion behavior"), coupled with analyses of experimental three-dimensional structures, can provide experimentally testable hypotheses that relate individual amino acid residues to specific functional differences in those branches. We illustrate this in the elongation factor family of proteins as a paradigm for applications of this type of analysis in functional genomics generally.
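To make the notion of "covarion behavior" concrete, the following is a minimal sketch of how one might flag alignment positions whose variability differs between two regions of a tree. It is not the statistical method the abstract refers to: per-clade Shannon entropy is used here only as a crude stand-in for replacement rate, and the function names, the threshold, and the toy sequences are all hypothetical.

```python
from collections import Counter
from math import log2


def site_entropy(column):
    """Shannon entropy (bits) of the residues in one alignment column."""
    counts = Counter(column)
    total = len(column)
    return -sum((n / total) * log2(n / total) for n in counts.values())


def rate_shift_sites(clade_a, clade_b, threshold=1.0):
    """Return (position, entropy_a - entropy_b) for sites whose variability
    differs between the two clades by at least `threshold` bits.

    Entropy is only a rough proxy for replacement rate (an assumption of
    this sketch); it ignores tree topology and branch lengths.
    """
    lengths = {len(seq) for seq in clade_a + clade_b}
    if len(lengths) != 1:
        raise ValueError("all sequences must be aligned to the same length")
    shifts = []
    # zip(*clade) transposes a list of sequences into alignment columns.
    for pos, (col_a, col_b) in enumerate(zip(zip(*clade_a), zip(*clade_b))):
        diff = site_entropy(col_a) - site_entropy(col_b)
        if abs(diff) >= threshold:
            shifts.append((pos, diff))
    return shifts


# Hypothetical toy alignment: two clades of four sequences, five sites each.
clade_a = ["MKVLA", "MKVLG", "MKVLS", "MKVLT"]
clade_b = ["MRILA", "MKVLA", "MHLLA", "MQALA"]

for pos, diff in rate_shift_sites(clade_a, clade_b):
    more_variable = "clade A" if diff > 0 else "clade B"
    print(f"site {pos}: more variable in {more_variable}")
```

In this toy data, sites 1 and 2 are conserved in clade A but variable in clade B, and site 4 shows the reverse. Positions flagged this way would be the candidates to map onto a three-dimensional structure when forming hypotheses about functional differences between branches.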