Center for Theoretical Biophysics, Rice University, Houston, TX 77005.
Applied Physics Graduate Program, Smalley-Curl Institute, Rice University, Houston, TX 77005.
Proc Natl Acad Sci U S A. 2024 May 21;121(21):e2322428121. doi: 10.1073/pnas.2322428121. Epub 2024 May 13.
Protein evolution is guided by structural, functional, and dynamical constraints that ensure organismal viability. Pseudogenes are genomic sequences, identified in many eukaryotes, that lack translational activity because of sequence degradation and have thus undergone "devolution" over time. Previously pseudogenized genes sometimes regain their protein-coding function, suggesting that they may still encode robust folding energy landscapes despite multiple mutations. We study both the physical folding landscapes of protein sequences corresponding to human pseudogenes, using the Associative Memory, Water Mediated, Structure and Energy Model (AWSEM), and the evolutionary energy landscapes obtained by applying direct coupling analysis (DCA) to their parent protein families. We found that, in general, the mutations that have occurred in pseudogene sequences have disrupted the native global network of stabilizing residue interactions, which would make these sequences harder to fold if they were translated. In some cases, however, energetic frustration has apparently decreased once the functional constraints were removed. We analyzed this unexpected situation for Cyclophilin A, Profilin-1, and Small Ubiquitin-like Modifier 2 Protein. Our analysis reveals that when such mutations in the pseudogene ultimately stabilize folding, they likely also alter the pseudogene's former biological activity, as estimated by DCA. Most of these stabilizing mutations localize to normally frustrated regions that are required for binding to other partners.
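As an illustration of what an evolutionary energy from DCA means in practice, the following minimal Python sketch scores a sequence with a Potts-type energy, E = -Σ_i h_i(a_i) - Σ_{i<j} J_ij(a_i, a_j), and compares a parent sequence with a mutated one. It is not the authors' pipeline: the function name `potts_energy`, the three-residue sequences, and all field and coupling values are invented for the example, and the sign convention (lower energy means better agreement with the family) is assumed.

```python
from itertools import combinations


def potts_energy(seq, fields, couplings):
    """Potts-type energy: E = -sum_i h_i(a_i) - sum_{i<j} J_ij(a_i, a_j).

    fields[i] maps a residue at position i to its local field h_i.
    couplings[(i, j)] maps a residue pair to its coupling J_ij (i < j).
    Missing entries are treated as zero.
    """
    energy = 0.0
    for i, residue in enumerate(seq):
        energy -= fields[i].get(residue, 0.0)
    for i, j in combinations(range(len(seq)), 2):
        pair_table = couplings.get((i, j), {})
        energy -= pair_table.get((seq[i], seq[j]), 0.0)
    return energy


# Hypothetical parameters for a toy three-position "family" (not fitted to data).
fields = [
    {"A": 1.0, "G": 0.2},
    {"L": 0.8, "P": -0.5},
    {"K": 0.5, "E": 0.4},
]
couplings = {(0, 2): {("A", "K"): 0.6, ("A", "E"): -0.3}}

parent = "ALK"  # stand-in for a functional parent sequence
pseudo = "APE"  # stand-in for a mutated pseudogene sequence

# A positive difference means the mutated sequence fits the family model less well.
delta_e = potts_energy(pseudo, fields, couplings) - potts_energy(parent, fields, couplings)
print(f"Delta E (pseudogene - parent) = {delta_e:.2f}")
```

In a real analysis the fields and couplings would be inferred from a multiple sequence alignment of the parent protein family, and the energy difference would be read alongside the physical folding energies rather than on its own.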