Institute for Integrative Biology of the Cell (I2BC), CEA, CNRS, Univ. Paris-Sud, Université Paris-Saclay, Gif-sur-Yvette, Cedex, 91198, France.
Hospital del Mar Research Institute, Barcelona, Spain.
Genome Biol. 2024 Oct 14;25(1):268. doi: 10.1186/s13059-024-03403-7.
Pervasive translation is a widespread phenomenon that plays a critical role in the emergence of novel microproteins, but the diversity of translation patterns contributing to their generation remains unclear. Based on 54 ribosome profiling (Ribo-Seq) datasets, we investigated the yeast Ribo-Seq landscape using a representation framework that allows the comprehensive inventory and classification of the entire diversity of Ribo-Seq signals, including non-canonical ones.
We show that, whereas coding regions occupy specific areas of the Ribo-Seq landscape, noncoding regions encompass a wide diversity of Ribo-Seq signals and populate the entire landscape. Our results show that pervasive translation can, nevertheless, be associated with high specificity, with 1055 noncoding ORFs exhibiting canonical Ribo-Seq signals. Using mass spectrometry under standard conditions or proteasome inhibition with an in-house analysis protocol, we report 239 microproteins originating from noncoding ORFs that display canonical as well as non-canonical Ribo-Seq signals. Each condition yields dozens of additional microprotein candidates with comparable translation properties, suggesting a larger population of volatile microproteins that are challenging to detect. Our findings suggest that non-canonical translation signals may harbor valuable information and underscore the significance of considering them in proteogenomic studies. Finally, we show that the translation outcome of a noncoding ORF is primarily determined by the initiating codon and the codon distribution in its two alternative frames, rather than features indicative of functionality.
Our results enable us to propose a topology of a species' Ribo-Seq landscape, opening the way to comparative analyses of this translation landscape under different conditions.