Department of Biological Sciences, University of Texas at Dallas, Richardson, TX 75080.
Department of Bioengineering, University of Texas at Dallas, Richardson, TX 75080.
Proc Natl Acad Sci U S A. 2024 Feb 6;121(6):e2308895121. doi: 10.1073/pnas.2308895121. Epub 2024 Jan 29.
Computational models of evolution are valuable for understanding the dynamics of sequence variation, for inferring phylogenetic relationships or potential evolutionary pathways, and for biomedical and industrial applications. Despite these benefits, few studies have validated the propensity of such models to generate outputs with in vivo functionality, a validation that would enhance their value as accurate and interpretable evolutionary algorithms. We demonstrate the power of epistasis inferred from natural protein families to evolve sequence variants in an algorithm we developed called sequence evolution with epistatic contributions (SEEC). Using the Hamiltonian of the joint probability of sequences in the family as a fitness metric, we sampled TEM-1 variants and experimentally tested them for in vivo β-lactamase activity. These evolved proteins can have dozens of mutations dispersed across the structure while preserving sites essential for both catalysis and interactions. Remarkably, these variants retain family-like functionality while being more active than their wild-type predecessor. We found that, depending on the inference method used to generate the epistatic constraints, different parameters simulate diverse selection strengths. Under weaker selection, local Hamiltonian fluctuations reliably predict relative changes to variant fitness, recapitulating neutral evolution. SEEC has the potential to explore the dynamics of neofunctionalization, characterize viral fitness landscapes, and facilitate vaccine development.
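As a rough illustration of how a Hamiltonian of the family's joint probability can act as a fitness metric during sampling, the sketch below assumes a Potts-type model with site fields `h` and pairwise couplings `J`, so that H(s) = -Σ_i h_i(s_i) - Σ_{i<j} J_ij(s_i, s_j). The function names, the `temperature` parameter standing in for selection strength, and the single-site resampling scheme are all assumptions made for illustration; the abstract does not specify these details, and this is not presented as the authors' SEEC implementation.

```python
import numpy as np

def potts_hamiltonian(seq, h, J):
    """Energy of an integer-encoded sequence; lower values are more family-like.

    Assumes h has shape (L, q) and J has shape (L, L, q, q) with
    J[i, j, a, b] == J[j, i, b, a] (hypothetical parameter layout).
    """
    L = len(seq)
    energy = 0.0
    for i in range(L):
        energy -= h[i, seq[i]]
        for j in range(i + 1, L):
            energy -= J[i, j, seq[i], seq[j]]
    return energy

def conditional_site_probs(seq, site, h, J, temperature=1.0):
    """P(residue a at `site` | rest of the sequence), including epistatic terms.

    `temperature` is an assumed knob: larger values flatten the distribution,
    which mimics weaker selection.
    """
    local = h[site].copy()
    for j in range(len(seq)):
        if j != site:
            # Coupling between each candidate residue at `site` and residue seq[j].
            local += J[site, j, :, seq[j]]
    logits = local / temperature
    logits -= logits.max()  # subtract the max for numerical stability
    weights = np.exp(logits)
    return weights / weights.sum()

def evolve(seq, h, J, n_steps, temperature=1.0, rng=None):
    """Resample one randomly chosen site per step from its conditional distribution."""
    rng = np.random.default_rng() if rng is None else rng
    seq = np.array(seq, copy=True)
    for _ in range(n_steps):
        site = rng.integers(len(seq))
        probs = conditional_site_probs(seq, site, h, J, temperature)
        seq[site] = rng.choice(len(probs), p=probs)
    return seq
```

With `temperature=1.0`, the log-ratio of two conditional probabilities at a site equals the Hamiltonian difference between the corresponding single-site variants, which is why local changes in H can serve as a proxy for relative fitness in a sampler of this kind.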