Eick Geeta N, Bridgham Jamie T, Anderson Douglas P, Harms Michael J, Thornton Joseph W
Institute of Ecology & Evolutionary Biology, University of Oregon, Eugene, OR.
Department of Anthropology, University of Oregon, Eugene, OR.
Mol Biol Evol. 2017 Feb 1;34(2):247-261. doi: 10.1093/molbev/msw223.
Hypotheses about the functions of ancient proteins and the effects of historical mutations on them are often tested using ancestral protein reconstruction (APR): phylogenetic inference of ancestral sequences followed by synthesis and experimental characterization. Usually, some sequence sites are ambiguously reconstructed, with two or more statistically plausible states. The extent to which the inferred functions and mutational effects are robust to uncertainty about the ancestral sequence has not been studied systematically. To address this issue, we reconstructed ancestral proteins in three domain families that have different functions, architectures, and degrees of uncertainty; we then experimentally characterized the functional robustness of these proteins when uncertainty was incorporated using several approaches, including sampling amino acid states from the posterior distribution at each site and incorporating the alternative amino acid state at every ambiguous site in the sequence into a single "worst plausible case" protein. In every case, qualitative conclusions about the ancestral proteins' functions and the effects of key historical mutations were robust to sequence uncertainty, with similar functions observed even when scores of alternate amino acids were incorporated. There was some variation in quantitative descriptors of function among plausible sequences, suggesting that experimentally characterizing robustness is particularly important when quantitative estimates of ancient biochemical parameters are desired. The worst plausible case method appears to provide an efficient strategy for characterizing the functional robustness of ancestral proteins to large amounts of sequence uncertainty. Sampling from the posterior distribution sometimes produced artifactually nonfunctional proteins for sequences reconstructed with substantial ambiguity.
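The two ways of incorporating uncertainty named in the abstract (sampling each site from its posterior distribution, and building a single "worst plausible case" sequence) can be illustrated with a minimal Python sketch. It is not the authors' code. The input layout (one dictionary of amino acid posterior probabilities per site), the function names, the toy data, and the `threshold` used to call a site ambiguous are all assumptions made for illustration; the abstract does not specify them.

```python
import random
from typing import Dict, List

# Assumed input format: one dict per site, mapping amino acid -> posterior probability.
Posterior = List[Dict[str, float]]


def ml_sequence(posterior: Posterior) -> str:
    """Maximum a posteriori sequence: the most probable state at every site."""
    return "".join(max(site, key=site.get) for site in posterior)


def worst_plausible_case(posterior: Posterior, threshold: float = 0.2) -> str:
    """Use the second most probable state at every ambiguous site.

    A site is treated as ambiguous when its second-best state has posterior
    probability >= threshold. The default value is an illustrative assumption.
    """
    seq = []
    for site in posterior:
        ranked = sorted(site, key=site.get, reverse=True)
        if len(ranked) > 1 and site[ranked[1]] >= threshold:
            seq.append(ranked[1])  # ambiguous: take the alternative state
        else:
            seq.append(ranked[0])  # unambiguous: keep the most probable state
    return "".join(seq)


def sample_from_posterior(posterior: Posterior, rng: random.Random) -> str:
    """Draw one state per site, independently, weighted by posterior probability."""
    seq = []
    for site in posterior:
        states = list(site)
        weights = [site[s] for s in states]
        seq.append(rng.choices(states, weights=weights, k=1)[0])
    return "".join(seq)


# Toy four-site reconstruction (made-up numbers, not data from the paper).
toy = [
    {"M": 1.00},
    {"K": 0.55, "R": 0.45},
    {"L": 0.90, "I": 0.07, "V": 0.03},
    {"S": 0.60, "T": 0.30, "A": 0.10},
]

print(ml_sequence(toy))           # MKLS
print(worst_plausible_case(toy))  # MRLT
print(sample_from_posterior(toy, random.Random(0)))
```

In this sketch the worst plausible case sequence differs from the most probable sequence at every ambiguous site at once, so a single construct covers all flagged sites. Posterior sampling can also draw low-probability states, such as `V` at the third toy site, which the thresholded approach never incorporates.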