Reduced Amino Acid Substitution Matrices Find Traces of Ancient Coding Alphabets in Modern Day Proteins.

Author information

Douglas Jordan, Bouckaert Remco, Carter Charles W, Wills Peter R

Affiliations

Department of Physics, The University of Auckland, Auckland, New Zealand.

Centre for Computational Evolution, The University of Auckland, Auckland, New Zealand.

Publication information

Mol Biol Evol. 2025 Sep 1;42(9). doi: 10.1093/molbev/msaf197.

Abstract

All known living systems make proteins from the same 20 canonically coded amino acids, but this was not always the case. Early genetic coding systems likely operated with a restricted pool of amino acid types and limited means to distinguish between them. Despite this, amino acid substitution models like LG and WAG all assume a constant coding alphabet over time. That makes them especially inappropriate for the aminoacyl-tRNA synthetases (aaRS), the enzymes that govern translation. To address this limitation, we created a class of substitution models that account for evolutionary changes in the coding alphabet size by defining the transition from 19 states in a past epoch to 20 now. We use a Bayesian phylogenetic framework to improve phylogeny estimation and testing of this two-alphabet hypothesis. The hypothesis was strongly rejected by datasets composed exclusively of "young" eukaryotic proteins. It was generally supported by "old" (aaRS and non-aaRS) proteins whose origins date from before the last universal common ancestor. Standard methods overestimate the divergence ages of proteins that originated under reduced coding alphabets in both simulated and aaRS alignments. The new model provides a timeline slightly more consistent with the Earth's history. Our findings suggest that aaRS functional bifurcation events can explain much of the genetic code's evolution, but there remain other unknown forces at play too. This work provides a robust, seamless framework for reconstructing phylogenies from ancient protein datasets and offers further insights into the dawn of molecular biology.
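The abstract does not say how the 19-state epoch is built. One simple way to picture a reduced alphabet is to lump two of the 20 amino acid states into a single state. The sketch below does that for a time-reversible rate matrix and is an illustration only, not the authors' model. It uses a randomly generated toy matrix rather than LG or WAG, and the names `toy_reversible_q` and `lump_states` are hypothetical.

```python
import random


def toy_reversible_q(n, seed=0):
    """Build a toy time-reversible rate matrix (NOT LG or WAG; assumption)."""
    rng = random.Random(seed)
    # Stationary frequencies, normalised to sum to 1.
    pi = [rng.random() + 0.1 for _ in range(n)]
    total = sum(pi)
    pi = [p / total for p in pi]
    # Symmetric exchangeabilities with a zero diagonal.
    s = [[0.0] * n for _ in range(n)]
    for a in range(n):
        for b in range(a + 1, n):
            s[a][b] = s[b][a] = rng.random()
    # Off-diagonal rates q[a][b] = s[a][b] * pi[b]; each row sums to zero.
    q = [[s[a][b] * pi[b] for b in range(n)] for a in range(n)]
    for a in range(n):
        q[a][a] = -sum(q[a][b] for b in range(n) if b != a)
    return q, pi


def lump_states(q, pi, i, j):
    """Merge states i and j into one state, placed last in the reduced matrix.

    Rates into the merged state are summed; rates out of it are averaged
    with weights proportional to the stationary frequencies of i and j.
    """
    n = len(q)
    keep = [k for k in range(n) if k not in (i, j)]
    w_i = pi[i] / (pi[i] + pi[j])
    w_j = 1.0 - w_i
    # Rows for the untouched states: original rates plus a merged column.
    q_red = [[q[a][b] for b in keep] + [q[a][i] + q[a][j]] for a in keep]
    # Row for the merged state; transitions between i and j disappear.
    merged = [w_i * q[i][b] + w_j * q[j][b] for b in keep]
    merged.append(-sum(merged))
    q_red.append(merged)
    pi_red = [pi[k] for k in keep] + [pi[i] + pi[j]]
    return q_red, pi_red


q20, pi20 = toy_reversible_q(20)
q19, pi19 = lump_states(q20, pi20, 3, 7)  # indices 3 and 7 are arbitrary
```

Under this construction the reduced matrix remains a valid reversible generator: its rows sum to zero and detailed balance holds for the merged frequencies. Which two states merge, and when the alphabet expands, would be modelling choices that the abstract leaves unspecified.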

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/b55c/12402984/9fc2c48f83e5/msaf197f1.jpg
