RNA language models predict mutations that improve RNA function.

Author information

Shulgina Yekaterina, Trinidad Marena I, Langeberg Conner J, Nisonoff Hunter, Chithrananda Seyone, Skopintsev Petr, Nissley Amos J, Patel Jaymin, Boger Ron S, Shi Honglue, Yoon Peter H, Doherty Erin E, Pande Tara, Iyer Aditya M, Doudna Jennifer A, Cate Jamie H D

Affiliations

Innovative Genomics Institute, University of California, Berkeley, CA, USA.

Department of Molecular and Cell Biology, University of California, Berkeley, CA, USA.

Publication information

Nat Commun. 2024 Dec 5;15(1):10627. doi: 10.1038/s41467-024-54812-y.

Abstract

Structured RNA lies at the heart of many central biological processes, from gene expression to catalysis. RNA structure prediction is not yet possible due to a lack of high-quality reference data associated with organismal phenotypes that could inform RNA function. We present GARNET (Gtdb Acquired RNa with Environmental Temperatures), a new database for RNA structural and functional analysis anchored to the Genome Taxonomy Database (GTDB). GARNET links RNA sequences to experimental and predicted optimal growth temperatures of GTDB reference organisms. Using GARNET, we develop sequence- and structure-aware RNA generative models, with overlapping triplet tokenization providing optimal encoding for a GPT-like model. Leveraging hyperthermophilic RNAs in GARNET and these RNA generative models, we identify mutations in ribosomal RNA that confer increased thermostability to the Escherichia coli ribosome. The GTDB-derived data and deep learning models presented here provide a foundation for understanding the connections between RNA sequence, structure, and function.
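The abstract names overlapping triplet tokenization as the encoding used for the GPT-like model but does not spell out the scheme. As a minimal sketch, assuming it means stride-1 windows of three nucleotides mapped to a 64-entry vocabulary, the idea could look like the following. The function names, the vocabulary ordering, and the T-to-U normalization are illustrative assumptions, not the authors' implementation.

```python
from itertools import product

# Hypothetical vocabulary: all 64 triplets over A, C, G, U in lexicographic order.
# The ordering is an assumption made for this illustration only.
VOCAB = {"".join(p): i for i, p in enumerate(product("ACGU", repeat=3))}


def overlapping_triplets(seq: str) -> list[str]:
    """Split an RNA sequence into overlapping 3-mers using a stride of 1."""
    seq = seq.upper().replace("T", "U")  # treat DNA-style input as RNA
    return [seq[i:i + 3] for i in range(len(seq) - 2)]


def encode(seq: str) -> list[int]:
    """Map each overlapping triplet to its integer id in VOCAB.

    Raises KeyError for symbols outside A, C, G, U (for example N).
    """
    return [VOCAB[t] for t in overlapping_triplets(seq)]


if __name__ == "__main__":
    print(overlapping_triplets("GGCUAC"))
    print(encode("GGCUAC"))
```

Under this assumption a sequence of length n yields n - 2 tokens, and each nucleotide away from the ends appears in three consecutive tokens, so every token carries local sequence context.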

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/70dd/11621547/8e7871188ca2/41467_2024_54812_Fig1_HTML.jpg