Amino acid sequence encodes protein abundance shaped by protein stability at reduced synthesis cost.

Authors

Buric Filip, Viknander Sandra, Fu Xiaozhi, Lemke Oliver, Carmona Oriol Gracia, Zrimec Jan, Szyrwiel Lukasz, Mülleder Michael, Ralser Markus, Zelezniak Aleksej

Affiliations

Department of Biology and Biological Engineering, Chalmers University of Technology, Gothenburg, Sweden.

Department of Biochemistry, Charité - Universitätsmedizin Berlin, Berlin, Germany.

Publication

Protein Sci. 2025 Jan;34(1):e5239. doi: 10.1002/pro.5239.

Abstract

Understanding what drives protein abundance is essential to biology, medicine, and biotechnology. Driven by evolutionary selection, an amino acid sequence is tailored to meet the required abundance of a proteome, underscoring the intricate relationship between sequence and functional demand. Yet the specific role of amino acid sequences in determining proteome abundance remains elusive. Here we show that the amino acid sequence alone encodes over half of protein abundance variation across all domains of life, ranging from bacteria to mouse and human. To go beyond prediction, we trained a manageable-size Transformer model to interpret latent factors predictive of protein abundances. Intuitively, the model's attention focused on the protein's structural features linked to stability and on metabolic costs related to protein synthesis. To probe these relationships, we introduce MGEM (Mutation Guided by an Embedded Manifold), a methodology for guiding protein abundance through sequence modifications. We find that mutations that increase predicted abundance significantly alter protein polarity and hydrophobicity, underscoring a connection between protein structural features and abundance. Molecular dynamics simulations revealed that abundance-enhancing mutations possibly contribute to protein thermostability by increasing rigidity, which occurs at a lower synthesis cost.

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/1a2a/11635393/67e1de7b0360/PRO-34-e5239-g002.jpg