Blaisdell B E
J Mol Evol. 1985;22(1):69-81. doi: 10.1007/BF02105807.
The course of evolutionary change in DNA sequences has been modeled as a Markov process. The Markov process was represented by discrete time matrix methods. The parameters of the Markov transition matrices were estimated by least-squares direct-search optimization of the fit of the calculated divergence matrix to that observed for two aligned sequences. The Markov process corrected for multiple and parallel substitutions of bases at the same site. The method avoided the incorrect assumption of all previously described methods that the divergence between two present-day sequences is twice the divergence of either from the common and unknown ancestral sequence. The three previous methods were shown to be equivalent. The present method also avoided the undesirable assumptions that sequence composition has not changed with time and that the substitution rates in the two descendant lineages were the same. It permitted simultaneous estimation of ancestral sequence composition and, if applicable, of different substitution rates for the two descendant lineages, provided the total number of estimated parameters was less than 16. Properties of the Markov chain were discussed. It was proved for symmetric substitution matrices that all elements of the equilibrium divergence matrix equal 1/16, and that the total difference in the divergence matrix at epoch k equals the total change in the common substitution matrix at epoch 2k for all values of k. It was shown how to resolve an ambiguity in the assignment of two different substitution rates to the two descendant lineages when four or more similar sequences are available. The method was applied to the divergence matrix for codon site 3 for the mouse and rabbit beta-globins. This observed divergence matrix was significantly asymmetric and required at least two different substitution rates. This result could be achieved only by using different asymmetric substitution matrices for the two lineages.
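The two properties stated for symmetric substitution matrices can be checked numerically. The sketch below is a minimal illustration, not the paper's implementation. It assumes the divergence matrix at epoch k has elements D[i][j] = sum over ancestral bases a of pi[a] * (P1^k)[a][i] * (P2^k)[a][j], where pi is the ancestral composition and P1, P2 are the row-stochastic substitution matrices of the two lineages. It also reads "total difference" as the off-diagonal sum of D and "total change" as the composition-weighted probability that a site differs from its starting base. The base order, the two-rate symmetric matrix, and all function names and numeric values are hypothetical choices made for illustration.

```python
def matmul(a, b):
    """Product of two square matrices stored as lists of rows."""
    n = len(a)
    return [[sum(a[i][m] * b[m][j] for m in range(n)) for j in range(n)]
            for i in range(n)]


def matpow(p, k):
    """k-th power of a square matrix by repeated multiplication (k >= 0)."""
    n = len(p)
    result = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    for _ in range(k):
        result = matmul(result, p)
    return result


def symmetric_matrix(ts, tv):
    """Hypothetical symmetric substitution matrix, base order A, C, G, T.

    ts is the per-epoch probability of a transition (A<->G, C<->T) and
    tv the probability of each of the two possible transversions.
    """
    transitions = {(0, 2), (2, 0), (1, 3), (3, 1)}
    return [[1.0 - ts - 2.0 * tv if i == j
             else (ts if (i, j) in transitions else tv)
             for j in range(4)] for i in range(4)]


def divergence_matrix(pi, p1, p2, k):
    """Assumed model: D[i][j] = sum_a pi[a] * (p1^k)[a][i] * (p2^k)[a][j]."""
    a1, a2 = matpow(p1, k), matpow(p2, k)
    n = len(pi)
    return [[sum(pi[a] * a1[a][i] * a2[a][j] for a in range(n))
             for j in range(n)] for i in range(n)]


def total_difference(d):
    """Off-diagonal sum of a divergence matrix whose elements sum to 1."""
    return 1.0 - sum(d[i][i] for i in range(len(d)))


def total_change(pi, p, k):
    """Composition-weighted probability that a site has changed after k epochs."""
    pk = matpow(p, k)
    return sum(pi[a] * (1.0 - pk[a][a]) for a in range(len(pi)))


def sum_sq_residual(d_calc, d_obs):
    """Least-squares objective that a direct search would minimize."""
    return sum((c - o) ** 2
               for row_c, row_o in zip(d_calc, d_obs)
               for c, o in zip(row_c, row_o))


# Illustrative values only; they are not taken from the paper.
pi = [0.4, 0.3, 0.2, 0.1]
p = symmetric_matrix(ts=0.05, tv=0.02)

d_equilibrium = divergence_matrix(pi, p, p, 500)   # many epochs
d_3 = divergence_matrix(pi, p, p, 3)

print(max(abs(x - 1.0 / 16.0) for row in d_equilibrium for x in row))
print(total_difference(d_3), total_change(pi, p, 6))
```

Under these assumptions the first printed value should be negligibly small, and the two values on the second line should agree to rounding error. A parameter fit in the spirit of the abstract would wrap `divergence_matrix` in a search over pi and the matrix entries that minimizes `sum_sq_residual` against the observed 4 x 4 divergence matrix. Because that matrix sums to 1 it supplies 15 independent values, which is consistent with the stated limit of fewer than 16 estimated parameters.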