Alvarez Sophia, Nartey Charisse M, Mercado Nicholas, de la Paz Alberto, Huseinbegovic Tea, Morcos Faruck
Department of Biological Sciences, University of Texas at Dallas, Richardson, TX 75080, USA.
School of Natural Sciences and Mathematics, University of Texas at Dallas, Richardson, TX 75080, USA.
bioRxiv. 2023 May 25:2023.05.24.542176. doi: 10.1101/2023.05.24.542176.
Computational models of evolution are valuable for understanding the dynamics of sequence variation, for inferring phylogenetic relationships or potential evolutionary pathways, and for biomedical and industrial applications. Despite these benefits, few such models have been validated for their propensity to generate functional outputs, a property that would enhance their value as accurate and interpretable evolutionary algorithms. We demonstrate the power of epistasis inferred from natural protein families to evolve sequence variants in an algorithm we developed called Sequence Evolution with Epistatic Contributions (SEEC). Using the Hamiltonian of the joint probability of sequences in the family as a fitness metric, we sampled TEM-1 variants and experimentally tested them for β-lactamase activity. These evolved proteins can have dozens of mutations dispersed across the structure while preserving sites essential for both catalysis and interactions. Remarkably, these variants retain family-like functionality while being more active than their wild-type (WT) predecessor. We found that, depending on the inference method used to generate the epistatic constraints, different parameters simulate diverse selection strengths. Under weaker selection, local Hamiltonian fluctuations reliably predict relative changes in variant fitness, recapitulating neutral evolution. SEEC has the potential to explore the dynamics of neofunctionalization, characterize viral fitness landscapes and facilitate vaccine development.
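The idea of using a family-level Hamiltonian as a fitness metric while sampling variants can be illustrated with a minimal sketch. The abstract does not give the functional form or the sampling rule, so everything below is an assumption: a Potts-style energy with site fields `h` and pairwise couplings `J` (the form commonly used for epistatic models of protein families), a site-wise update drawn from the conditional distribution at one randomly chosen position, and a hypothetical `temperature` parameter standing in for selection strength. The parameters are random placeholders, not values inferred from any real alignment, and the function names are illustrative rather than taken from the SEEC implementation.

```python
import numpy as np


def hamiltonian(seq, h, J):
    """Assumed Potts-style energy:
    H(s) = -sum_i h[i, s_i] - sum_{i<j} J[i, j, s_i, s_j].
    Lower energy corresponds to higher probability under the model."""
    length = len(seq)
    energy = -sum(h[i, seq[i]] for i in range(length))
    for i in range(length):
        for j in range(i + 1, length):
            energy -= J[i, j, seq[i], seq[j]]
    return float(energy)


def conditional_probs(seq, site, h, J, temperature=1.0):
    """Probability of each state at `site` given the rest of the sequence,
    proportional to exp(-H / temperature)."""
    n_states = h.shape[1]
    trial = np.array(seq, copy=True)
    energies = np.empty(n_states)
    for state in range(n_states):
        trial[site] = state
        energies[state] = hamiltonian(trial, h, J)
    # Shift by the minimum energy so the largest exponent is zero.
    weights = np.exp(-(energies - energies.min()) / temperature)
    return weights / weights.sum()


def evolve(seq, h, J, steps, rng, temperature=1.0):
    """Resample one randomly chosen site per step; record H after each step."""
    seq = np.array(seq, copy=True)
    trajectory = [hamiltonian(seq, h, J)]
    for _ in range(steps):
        site = rng.integers(len(seq))
        probs = conditional_probs(seq, site, h, J, temperature)
        seq[site] = rng.choice(len(probs), p=probs)
        trajectory.append(hamiltonian(seq, h, J))
    return seq, trajectory


# Toy demonstration with placeholder parameters (hypothetical sizes).
rng = np.random.default_rng(0)
length, n_states = 8, 4
h = rng.normal(size=(length, n_states))
J = rng.normal(scale=0.1, size=(length, length, n_states, n_states))
start = rng.integers(n_states, size=length)
evolved, trajectory = evolve(start, h, J, steps=50, rng=rng)
```

In this sketch, a larger `temperature` flattens the conditional distribution, so substitutions are accepted almost regardless of their effect on the Hamiltonian, while a smaller value concentrates sampling on low-energy states. That is one simple way to picture weaker versus stronger selection, though the parameters that play this role in SEEC may differ.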