Decoding functional proteome information in model organisms using protein language models.

Author information

Barrios-Núñez Israel, Martínez-Redondo Gemma I, Medina-Burgos Patricia, Cases Ildefonso, Fernández Rosa, Rojas Ana M

Affiliations

Computational Biology and Bioinformatics Group, Andalusian Center for Developmental Biology (CABD-CSIC), 41013 Sevilla, Spain.

Metazoa Phylogenomics Lab, Institute of Evolutionary Biology (CSIC-UPF), 08003 Barcelona, Spain.

Publication information

NAR Genom Bioinform. 2024 Jul 2;6(3):lqae078. doi: 10.1093/nargab/lqae078. eCollection 2024 Sep.

Abstract

Protein language models have been tested and proved to be reliable when used on curated datasets but have not yet been applied to full proteomes. Accordingly, we tested how two different machine learning-based methods performed when decoding functional information from the proteomes of selected model organisms. We found that protein language models are more precise and informative than deep learning methods for all the species tested and across the three gene ontologies studied, and that they better recover functional information from transcriptomic experiments. The results obtained indicate that these language models are likely to be suitable for large-scale annotation and downstream analyses, and we recommend a guide for their use.

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/8169/11217674/afa1fcd95dfd/lqae078figgra1.jpg
