Yang Z, Kumar S, Nei M
Institute of Molecular Evolutionary Genetics, Pennsylvania State University, University Park 16802, USA.
Genetics. 1995 Dec;141(4):1641-50. doi: 10.1093/genetics/141.4.1641.
A statistical method was developed for reconstructing the nucleotide or amino acid sequences of extinct ancestors, given the phylogeny and sequences of the extant species. A model of nucleotide or amino acid substitution was employed to analyze data of the present-day sequences, and maximum likelihood estimates of parameters such as branch lengths were used to compare the posterior probabilities of assignments of character states (nucleotides or amino acids) to interior nodes of the tree; the assignment having the highest probability was the best reconstruction at the site. The lysozyme c sequences of six mammals were analyzed by using the likelihood and parsimony methods. The new likelihood-based method was found to be superior to the parsimony method. The probability that the amino acids for all interior nodes at a site reconstructed by the new method are correct was calculated to be 0.91, 0.86, and 0.73 for all, variable, and parsimony-informative sites, respectively, whereas the corresponding probabilities for the parsimony method were 0.84, 0.76, and 0.51, respectively. The probability that an amino acid in an ancestral sequence is correctly reconstructed by the likelihood analysis ranged from 91.3 to 98.7% for the four ancestral sequences.
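The reconstruction step described in the abstract (comparing posterior probabilities of character-state assignments to interior nodes and keeping the most probable one) can be illustrated at a single site with a minimal sketch. Everything concrete here is an assumption made for illustration and is not taken from the paper: the Jukes-Cantor (JC69) nucleotide model, the four-leaf tree ((a,b),(c,d)), the fixed branch lengths `t_leaf` and `t_inner` (which in the method would be maximum likelihood estimates), and the function names `jc69` and `joint_posteriors`.

```python
import itertools
import math

STATES = "ACGT"

def jc69(i, j, t):
    """JC69 probability of state i changing to state j along a branch of
    length t (expected substitutions per site). Assumed model."""
    e = math.exp(-4.0 * t / 3.0)
    return 0.25 + 0.75 * e if i == j else 0.25 - 0.25 * e

def joint_posteriors(leaves, t_leaf=0.1, t_inner=0.05):
    """Posterior probability of every joint assignment (root, x, y) at one
    site for the assumed tree ((a,b)x,(c,d)y)root.

    Branch lengths are fixed illustrative values, not ML estimates.
    """
    a, b, c, d = leaves
    joint = {}
    # Enumerate all 4^3 assignments of states to the three interior nodes.
    for r, x, y in itertools.product(STATES, repeat=3):
        p = 0.25  # equal equilibrium frequencies under JC69
        p *= jc69(r, x, t_inner) * jc69(r, y, t_inner)
        p *= jc69(x, a, t_leaf) * jc69(x, b, t_leaf)
        p *= jc69(y, c, t_leaf) * jc69(y, d, t_leaf)
        joint[(r, x, y)] = p
    total = sum(joint.values())  # likelihood of the observed site
    return {k: v / total for k, v in joint.items()}

# Observed nucleotides at the four leaves a, b, c, d (made-up data).
post = joint_posteriors("AAAG")
best = max(post, key=post.get)
print(best, round(post[best], 3))
```

Exhaustive enumeration is only practical for a tree this small; it is used here because it keeps the link between joint probability, site likelihood, and posterior explicit. The posterior of the best assignment plays the role of the per-site accuracy measure that the abstract reports for the likelihood method.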