Protein Abundance Prediction Through Machine Learning Methods.

Authors

Ferreira Mauricio, Ventorim Rafaela, Almeida Eduardo, Silveira Sabrina, Silveira Wendel

Affiliations

Department of Microbiology, Universidade Federal de Viçosa, Viçosa, MG 36570-900, Brazil. Electronic address: https://twitter.com/@mauriciomyces.

Department of Microbiology, Universidade Federal de Viçosa, Viçosa, MG 36570-900, Brazil.

Publication information

J Mol Biol. 2021 Nov 5;433(22):167267. doi: 10.1016/j.jmb.2021.167267. Epub 2021 Sep 23.

DOI: 10.1016/j.jmb.2021.167267
PMID: 34563548
Abstract

Proteins are responsible for most physiological processes, and their abundance provides crucial information for systems biology research. However, absolute protein quantification, as determined by mass spectrometry, still has limitations in capturing the protein pool. Protein abundance is impacted by translation kinetics, which rely on features of codons. In this study, we evaluated the effect of codon usage bias of genes on protein abundance. Notably, we observed differences regarding codon usage patterns between genes coding for highly abundant proteins and genes coding for less abundant proteins. Analysis of synonymous codon usage and evolutionary selection showed a clear split between the two groups. Our machine learning models predicted protein abundances from codon usage metrics with remarkable accuracy, achieving strong correlation with experimental data. Upon integration of the predicted protein abundance in enzyme-constrained genome-scale metabolic models, the simulated phenotypes closely matched experimental data, which demonstrates that our predictive models are valuable tools for systems metabolic engineering approaches.
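
The abstract does not say which codon usage metrics the authors computed. As an illustration of what such a metric can look like, the sketch below computes relative synonymous codon usage (RSCU), a commonly used measure: the observed count of a codon divided by the count expected if all synonymous codons for that amino acid were used equally. The function name `rscu`, the use of the standard genetic code, and the choice to skip stop codons and single-codon amino acids are assumptions made for this example, not details taken from the paper.

```python
from collections import Counter

# Standard genetic code, built from the conventional TCAG ordering.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODONS = [a + b + c for a in BASES for b in BASES for c in BASES]
CODON_TABLE = dict(zip(CODONS, AMINO))

# Group codons into synonymous families keyed by amino acid.
SYNONYMS = {}
for codon, aa in CODON_TABLE.items():
    SYNONYMS.setdefault(aa, []).append(codon)


def rscu(cds):
    """Return {codon: RSCU} for one coding sequence (hypothetical helper).

    Stop codons and single-codon amino acids (Met, Trp) are skipped, as are
    amino acids that do not occur in the sequence.
    """
    cds = cds.upper().replace("U", "T")
    usable = len(cds) - len(cds) % 3  # ignore a trailing partial codon
    counts = Counter(cds[i:i + 3] for i in range(0, usable, 3))

    values = {}
    for aa, family in SYNONYMS.items():
        if aa == "*" or len(family) == 1:
            continue
        total = sum(counts[c] for c in family)
        if total == 0:
            continue
        expected = total / len(family)  # count under uniform synonymous usage
        for c in family:
            values[c] = counts[c] / expected
    return values


if __name__ == "__main__":
    # Toy sequence: four alanine codons, three GCT and one GCC.
    for codon, value in sorted(rscu("GCTGCTGCTGCC").items()):
        print(codon, round(value, 2))
```

For the toy sequence, GCT receives an RSCU of 3.0 and GCC of 1.0, while the unused alanine codons GCA and GCG receive 0.0. A per-gene vector of such values is one plausible input for a regression model, though the features and models used in the study may differ.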

Similar articles

1
Protein Abundance Prediction Through Machine Learning Methods.
J Mol Biol. 2021 Nov 5;433(22):167267. doi: 10.1016/j.jmb.2021.167267. Epub 2021 Sep 23.
2
Analysis of computational codon usage models and their association with translationally slow codons.
PLoS One. 2020 Apr 30;15(4):e0232003. doi: 10.1371/journal.pone.0232003. eCollection 2020.
3
Coevolution of codon usage and transfer RNA abundance.
Nature. 1987;325(6106):728-30. doi: 10.1038/325728a0.
4
The combined influence of codon composition and tRNA copy number regulates translational efficiency by influencing synonymous nucleotide substitution.
Gene. 2020 Jun 30;745:144640. doi: 10.1016/j.gene.2020.144640. Epub 2020 Apr 1.
5
Insights into the evolutionary forces that shape the codon usage in the viral genome segments encoding intrinsically disordered protein regions.
Brief Bioinform. 2021 Sep 2;22(5). doi: 10.1093/bib/bbab145.
6
Growth-dependent Gene Expression Variation Influences the Strength of Codon Usage Biases.
Mol Biol Evol. 2023 Sep 1;40(9). doi: 10.1093/molbev/msad189.
7
Quantifying shifts in natural selection on codon usage between protein regions: a population genetics approach.
BMC Genomics. 2022 May 30;23(1):408. doi: 10.1186/s12864-022-08635-0.
8
Conserved codon composition of ribosomal protein coding genes in Escherichia coli, Mycobacterium tuberculosis and Saccharomyces cerevisiae: lessons from supervised machine learning in functional genomics.
Nucleic Acids Res. 2002 Jun 1;30(11):2599-607. doi: 10.1093/nar/30.11.2599.
9
Modelling the efficiency of codon-tRNA interactions based on codon usage bias.
DNA Res. 2014 Oct;21(5):511-26. doi: 10.1093/dnares/dsu017. Epub 2014 Jun 6.
10
Codon usage patterns in Escherichia coli, Bacillus subtilis, Saccharomyces cerevisiae, Schizosaccharomyces pombe, Drosophila melanogaster and Homo sapiens; a review of the considerable within-species diversity.
Nucleic Acids Res. 1988 Sep 12;16(17):8207-11. doi: 10.1093/nar/16.17.8207.

Cited by

1
Inferring protein from transcript abundances using convolutional neural networks.
BioData Min. 2025 Feb 27;18(1):18. doi: 10.1186/s13040-025-00434-z.
2
A Multi-Omics, Machine Learning-Aware, Genome-Wide Metabolic Model of Bacillus Subtilis Refines the Gene Expression and Cell Growth Prediction.
Adv Sci (Weinh). 2024 Nov;11(42):e2408705. doi: 10.1002/advs.202408705. Epub 2024 Sep 17.
3
Predicting single-cell cellular responses to perturbations using cycle consistency learning.
Bioinformatics. 2024 Jun 28;40(Suppl 1):i462-i470. doi: 10.1093/bioinformatics/btae248.
4
Automated identification of protein expression intensity and classification of protein cellular locations in mouse brain regions from immunofluorescence images.
Med Biol Eng Comput. 2024 Apr;62(4):1105-1119. doi: 10.1007/s11517-023-02985-x. Epub 2023 Dec 27.
5
PARROT: Prediction of enzyme abundances using protein-constrained metabolic models.
PLoS Comput Biol. 2023 Oct 19;19(10):e1011549. doi: 10.1371/journal.pcbi.1011549. eCollection 2023 Oct.
6
Druggability of Targets for Diagnostic Radiopharmaceuticals.
ACS Pharmacol Transl Sci. 2023 Jul 12;6(8):1107-1119. doi: 10.1021/acsptsci.3c00081. eCollection 2023 Aug 11.
7
Ultradeep characterisation of translational sequence determinants refutes rare-codon hypothesis and unveils quadruplet base pairing of initiator tRNA and transcript.
Nucleic Acids Res. 2023 Mar 21;51(5):2377-2396. doi: 10.1093/nar/gkad040.
8
Bioinformatic Assessment of Factors Affecting the Correlation between Protein Abundance and Elongation Efficiency in Prokaryotes.
Int J Mol Sci. 2022 Oct 9;23(19):11996. doi: 10.3390/ijms231911996.
9
Learning the Regulatory Code of Gene Expression.
Front Mol Biosci. 2021 Jun 10;8:673363. doi: 10.3389/fmolb.2021.673363. eCollection 2021.
10
Multiomic Big Data Analysis Challenges: Increasing Confidence in the Interpretation of Artificial Intelligence Assessments.
Anal Chem. 2021 Jun 8;93(22):7763-7773. doi: 10.1021/acs.analchem.0c04850. Epub 2021 May 24.