Kosinski Luke, Aviles Nathan, Gomez Kevin, Masel Joanna
Department of Molecular and Cellular Biology, University of Arizona, Tucson, AZ, USA.
Graduate Interdisciplinary Program in Statistics, University of Arizona, Tucson, AZ, USA.
Genome Biol Evol. 2022 Jun 7;14(6). doi: 10.1093/gbe/evac085.
Proteins are the workhorses of the cell, yet they carry great potential for harm via misfolding and aggregation. Despite the dangers, proteins are sometimes born de novo from non-coding DNA. Proteins are more likely to be born from non-coding regions that produce peptides that do little to no harm when translated than from regions that produce harmful peptides. To investigate which newborn proteins are most likely to "first, do no harm", we estimate fitnesses from an experiment that competed Escherichia coli lineages that each expressed a unique random peptide. A variety of peptide metrics significantly predict lineage fitness, but this predictive power stems from simple amino acid frequencies rather than the ordering of amino acids. Amino acids that are smaller and that promote intrinsic structural disorder have more benign fitness effects. We validate that the amino acids that indicate benign effects in random peptides expressed in E. coli also do so in an independent dataset of random N-terminal tags in which it is possible to control for expression level. The same amino acids are also enriched in young animal proteins.
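The distinction between amino acid frequencies and amino acid ordering can be illustrated with a minimal sketch. It is not the authors' analysis code. The function name `amino_acid_frequencies` and the example peptide are hypothetical and chosen only for illustration. The sketch shows that a frequency-based descriptor is unchanged when the residues of a peptide are reordered, which is why a predictor built only on frequencies carries no information about order.

```python
from collections import Counter

# The 20 standard amino acids, one-letter codes.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def amino_acid_frequencies(peptide):
    """Return the fraction of each standard amino acid in a peptide.

    Hypothetical helper for illustration; non-standard characters are ignored.
    """
    counts = Counter(peptide.upper())
    total = sum(counts[aa] for aa in AMINO_ACIDS)
    if total == 0:
        raise ValueError("peptide contains no standard amino acids")
    return {aa: counts[aa] / total for aa in AMINO_ACIDS}


# Hypothetical peptide, not taken from the study's data.
peptide = "MSGSGAPKLV"
# Same residues in a different order (here, sorted alphabetically).
reordered = "".join(sorted(peptide))

# The frequency vectors are identical even though the sequences differ.
assert peptide != reordered
assert amino_acid_frequencies(peptide) == amino_acid_frequencies(reordered)
```

Any metric that can be written as a function of these 20 frequencies alone will, by construction, give the same value for a peptide and for every reordering of it.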