RNA sequence analysis landscape: A comprehensive review of task types, databases, datasets, word embedding methods, and language models

Authors

Asim Muhammad Nabeel, Ibrahim Muhammad Ali, Asif Tayyaba, Dengel Andreas

Affiliations

German Research Center for Artificial Intelligence GmbH, Kaiserslautern, 67663, Germany.

Department of Computer Science, Rhineland-Palatinate Technical University of Kaiserslautern-Landau, Kaiserslautern, 67663, Germany.

Publication information

Heliyon. 2025 Jan 6;11(2):e41488. doi: 10.1016/j.heliyon.2024.e41488. eCollection 2025 Jan 30.

DOI:10.1016/j.heliyon.2024.e41488
PMID:39897847
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC11783440/
Abstract

Deciphering the information encoded in RNA sequences reveals their diverse roles in living organisms, including gene regulation and protein synthesis. Aberrations in RNA, such as dysregulation and mutations, can drive a wide spectrum of diseases, including cancers, genetic disorders, and neurodegenerative conditions. Furthermore, researchers are harnessing RNA's therapeutic potential to transform traditional treatment paradigms into personalized therapies through the development of RNA-based drugs and gene therapies. To gain insight into biological functions, detect diseases at early stages, and develop potent therapeutics, researchers perform diverse types of RNA sequence analysis tasks. RNA sequence analysis through conventional wet-lab methods is expensive, time-consuming, and error-prone. Enabling large-scale RNA sequence analysis by augmenting wet-lab experimental methods with Artificial Intelligence (AI) applications requires scientists to have comprehensive knowledge of both DNA and AI fields. While molecular biologists encounter challenges in understanding AI methods, computer scientists often lack the basic foundations of RNA sequence analysis tasks. Given the absence of comprehensive literature that bridges this research gap and promotes the development of AI-driven RNA sequence analysis applications, the contributions of this manuscript are manifold. It equips AI researchers with the biological foundations of 47 distinct RNA sequence analysis tasks. It sets the stage for developing benchmark datasets for these 47 tasks by providing the key details of 64 different biological databases. It presents applications of word embeddings and language models across the 47 tasks. It streamlines the development of new predictors by surveying the performance of predictive pipelines based on 58 word embeddings and 70 language models, as well as the top-performing predictors based on traditional sequence encoding, across the 47 RNA sequence analysis tasks.

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/45c8/11783440/33e63ecbb2d6/gr001.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/45c8/11783440/2564b187df70/gr002.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/45c8/11783440/29884afb1215/gr003.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/45c8/11783440/ec1fee585608/gr004.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/45c8/11783440/5861fabb547f/gr005.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/45c8/11783440/2a1465be059a/gr006.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/45c8/11783440/cc958f66bd07/gr007.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/45c8/11783440/56abf955a8c6/gr008.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/45c8/11783440/d3b7ea2080a9/gr009.jpg
