Department of Biological Sciences, University of Maryland, Baltimore County, 1000 Hilltop Circle, Baltimore, MD 21250, USA.
J Theor Biol. 2021 Jul 7;520:110661. doi: 10.1016/j.jtbi.2021.110661. Epub 2021 Mar 5.
A series of prior publications has reported unusual properties of the set of genetically encoded amino acids shared by all known life. That work uses quantitative measures (descriptors) of size, charge and hydrophobicity to compare the distribution of the genetically encoded amino acids with random samples of plausible alternatives. Its results show that the standard "alphabet" of amino acids established by the time of LUCA (the last universal common ancestor) is distributed with unusual evenness over a broad range for three key physicochemical properties. However, different publications have used slightly different assumptions, including variations in the precise descriptors used, the set of plausible alternative molecules considered, and the format in which results are presented. Here we consolidate these findings into a unified framework in order to clarify the unusual features. We find that, in general, the remarkable features of the full set of 20 genetically encoded amino acids are robust when compared with random samples drawn from a densely populated pool of plausible alternative L-α-amino acids. In particular, the genetically encoded set is distributed across an exceptionally broad range of volumes, and is distributed exceptionally evenly within a modest range of hydrophobicities. Surprisingly, the range and evenness of charge (pKa) are exceptional only for the full amino acid structures, not for their sidechains, a result inconsistent with prior interpretations that invoke the role amino acid sidechains play within protein sequences. In stark contrast, these remarkable features are far less clear when the prebiotically plausible subset of genetically encoded amino acids is compared with a much smaller pool of prebiotically plausible alternatives. By considering the nature of the "optimality theory" approach taken to derive these and prior insights, we suggest productive avenues for further research.
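The comparison described above, an observed set tested against random samples of equal size drawn from a pool of alternatives, can be illustrated with a minimal Python sketch. It is not the authors' implementation. The abstract does not state how evenness is measured, so the variance of gaps between adjacent sorted values (lower means more even) is an assumption made here, and the function names, sample count, and seed are all hypothetical.

```python
import random
import statistics


def value_range(values):
    """Breadth of a set along one descriptor: max minus min."""
    return max(values) - min(values)


def gap_variance(values):
    """ASSUMED evenness measure: population variance of the gaps between
    adjacent sorted values. Zero means perfectly even spacing."""
    ordered = sorted(values)
    gaps = [b - a for a, b in zip(ordered, ordered[1:])]
    return statistics.pvariance(gaps)


def fraction_matched(observed, pool, n_samples=10_000, seed=0):
    """Draw random sets of len(observed) from pool (without replacement)
    and return two fractions: samples with a range at least as broad as
    the observed set, and samples at least as even. Small fractions mean
    the observed set is unusual on that measure."""
    rng = random.Random(seed)  # fixed seed, chosen only for repeatability
    obs_range = value_range(observed)
    obs_gapvar = gap_variance(observed)
    k = len(observed)
    broader = 0
    more_even = 0
    for _ in range(n_samples):
        sample = rng.sample(pool, k)
        if value_range(sample) >= obs_range:
            broader += 1
        if gap_variance(sample) <= obs_gapvar:
            more_even += 1
    return broader / n_samples, more_even / n_samples


if __name__ == "__main__":
    # Made-up descriptor values for illustration only; NOT real amino acid data.
    toy_pool = [float(v) for v in range(100)]
    toy_observed = [0.0, 33.0, 66.0, 99.0]
    p_range, p_even = fraction_matched(toy_observed, toy_pool, n_samples=2000)
    print(f"fraction at least as broad: {p_range:.4f}")
    print(f"fraction at least as even:  {p_even:.4f}")
```

In this sketch each descriptor (volume, hydrophobicity, pKa) would be tested separately by passing its values for the observed set and for the pool. The choice of pool is the main modelling decision: as the abstract notes, results differ between a large pool of plausible alternatives and a much smaller prebiotically plausible one.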