PseUdeep: RNA Pseudouridine Site Identification with Deep Learning Algorithm.

Authors

Zhuang Jujuan, Liu Danyang, Lin Meng, Qiu Wenjing, Liu Jinyang, Chen Size

Affiliations

College of Science, Dalian Maritime University, Dalian, China.

Electrical and Information Engineering, Anhui University of Technology, Anhui, China.

Publication information

Front Genet. 2021 Nov 18;12:773882. doi: 10.3389/fgene.2021.773882. eCollection 2021.

DOI:10.3389/fgene.2021.773882
PMID:34868261
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC8637112/

Abstract

Pseudouridine (Ψ) is a common ribonucleotide modification that plays a significant role in many biological processes. Identifying Ψ modification sites is important for research into disease mechanisms and biological processes, and machine learning algorithms are attractive for this task because laboratory techniques are expensive and time-consuming. In this work, we propose a deep learning framework, called PseUdeep, to identify Ψ sites in three species: H. sapiens, S. cerevisiae, and M. musculus. The method uses three encodings to extract features from RNA sequences: one-hot encoding, K-tuple nucleotide frequency pattern, and position-specific nucleotide composition. The three feature matrices are convolved twice and fed into a capsule neural network and a bidirectional gated recurrent unit network with a self-attention mechanism for classification. Compared with other state-of-the-art methods, our model achieves the highest prediction accuracy on the independent testing data set S-200, improving accuracy by 12.38%; on the independent testing data set H-200, accuracy improves by 0.68%. Moreover, the features we derive from the RNA sequences have only 109, 109, and 119 dimensions in H. sapiens, M. musculus, and S. cerevisiae, respectively, far fewer than those used in traditional algorithms. On evaluation via tenfold cross-validation and two independent testing data sets, PseUdeep outperforms the best traditional machine learning model available. PseUdeep source code and data sets are available at https://github.com/dan111262/PseUdeep.
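
Two of the encodings named in the abstract, one-hot encoding and K-tuple nucleotide frequency, can be illustrated with a minimal Python sketch. This is not the authors' implementation (their code is in the repository linked above); the function names `one_hot` and `k_tuple_frequencies`, the example sequence, and the choice to map unknown symbols to zeros are all assumptions made for illustration. Position-specific nucleotide composition is omitted because it depends on labelled training sets.

```python
from itertools import product

ALPHABET = "ACGU"  # RNA nucleotides, in a fixed order


def one_hot(seq):
    """Return one 4-element row per nucleotide.

    Symbols outside ALPHABET (e.g. 'N') become an all-zero row; this is an
    assumption of the sketch, not something stated in the abstract.
    """
    index = {base: i for i, base in enumerate(ALPHABET)}
    rows = []
    for base in seq.upper():
        row = [0] * len(ALPHABET)
        if base in index:
            row[index[base]] = 1
        rows.append(row)
    return rows


def k_tuple_frequencies(seq, k):
    """Return the relative frequency of every possible k-tuple.

    Tuples are listed in lexicographic order over ALPHABET, so the result
    always has 4**k entries. Windows containing unknown symbols are skipped.
    """
    seq = seq.upper()
    tuples = ["".join(p) for p in product(ALPHABET, repeat=k)]
    counts = dict.fromkeys(tuples, 0)
    for i in range(max(len(seq) - k + 1, 0)):
        kmer = seq[i:i + k]
        if kmer in counts:
            counts[kmer] += 1
    total = sum(counts.values())
    return [counts[t] / total if total else 0.0 for t in tuples]


# Hypothetical 21-nt sequence with a uridine at the central position.
example = "GAUUCAGGCUUAGCAUCGAUC"
encoded = one_hot(example)               # 21 rows of 4 values
di_freq = k_tuple_frequencies(example, 2)  # 16 dinucleotide frequencies
```

In this sketch the one-hot matrix keeps positional information while the K-tuple vector summarizes composition, which is one reason such encodings are often combined before being passed to a convolutional layer.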

Figures

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/7314/8637112/a91002631376/fgene-12-773882-g001.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/7314/8637112/707070980191/fgene-12-773882-g002.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/7314/8637112/ab8effa23871/fgene-12-773882-g003.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/7314/8637112/dc78098e7fbb/fgene-12-773882-g004.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/7314/8637112/c670ca4a7f50/fgene-12-773882-g005.jpg

Similar articles

1
PseUdeep: RNA Pseudouridine Site Identification with Deep Learning Algorithm.
Front Genet. 2021 Nov 18;12:773882. doi: 10.3389/fgene.2021.773882. eCollection 2021.
2
MU-PseUDeep: A deep learning method for prediction of pseudouridine sites.
Comput Struct Biotechnol J. 2020 Jul 15;18:1877-1883. doi: 10.1016/j.csbj.2020.07.010. eCollection 2020.
3
PseU-ST: A new stacked ensemble-learning method for identifying RNA pseudouridine sites.
Front Genet. 2023 Jan 19;14:1121694. doi: 10.3389/fgene.2023.1121694. eCollection 2023.
4
PseUI: Pseudouridine sites identification based on RNA sequence information.
BMC Bioinformatics. 2018 Aug 29;19(1):306. doi: 10.1186/s12859-018-2321-0.
5
DeepMRMP: A new predictor for multiple types of RNA modification sites using deep learning.
Math Biosci Eng. 2019 Jul 4;16(6):6231-6241. doi: 10.3934/mbe.2019310.
6
Is There Any Sequence Feature in the RNA Pseudouridine Modification Prediction Problem?
Mol Ther Nucleic Acids. 2020 Mar 6;19:293-303. doi: 10.1016/j.omtn.2019.11.014. Epub 2019 Nov 21.
7
PseU-KeMRF: A Novel Method for Identifying RNA Pseudouridine Sites.
IEEE/ACM Trans Comput Biol Bioinform. 2024 Sep-Oct;21(5):1423-1435. doi: 10.1109/TCBB.2024.3389094. Epub 2024 Oct 9.
8
Porpoise: a new approach for accurate prediction of RNA pseudouridine sites.
Brief Bioinform. 2021 Nov 5;22(6). doi: 10.1093/bib/bbab245.
9
im5C-DSCGA: A Proposed Hybrid Framework Based on Improved DenseNet and Attention Mechanisms for Identifying 5-methylcytosine Sites in Human RNA.
Front Biosci (Landmark Ed). 2023 Dec 26;28(12):346. doi: 10.31083/j.fbl2812346.
10
Penguin: A tool for predicting pseudouridine sites in direct RNA nanopore sequencing data.
Methods. 2022 Jul;203:478-487. doi: 10.1016/j.ymeth.2022.02.005. Epub 2022 Feb 16.

Cited by

1
RSCNN-PseU: random searching-based convolutional neural network model for identifying RNA pseudouridine.
Brief Bioinform. 2025 Jul 2;26(4). doi: 10.1093/bib/bbaf417.
2
GBMPhos: A Gating Mechanism and Bi-GRU-Based Method for Identifying Phosphorylation Sites of SARS-CoV-2 Infection.
Biology (Basel). 2024 Oct 6;13(10):798. doi: 10.3390/biology13100798.
3
Molecular insights into regulatory RNAs in the cellular machinery.
Exp Mol Med. 2024 Jun;56(6):1235-1249. doi: 10.1038/s12276-024-01239-6. Epub 2024 Jun 14.
4
Fuzzy kernel evidence Random Forest for identifying pseudouridine sites.
Brief Bioinform. 2024 Mar 27;25(3). doi: 10.1093/bib/bbae169.
5
Transformer Architecture and Attention Mechanisms in Genome Data Analysis: A Comprehensive Review.
Biology (Basel). 2023 Jul 22;12(7):1033. doi: 10.3390/biology12071033.
6
Evaluation and development of deep neural networks for RNA 5-Methyluridine classifications using autoBioSeqpy.
Front Microbiol. 2023 May 18;14:1175925. doi: 10.3389/fmicb.2023.1175925. eCollection 2023.
7
PseU-ST: A new stacked ensemble-learning method for identifying RNA pseudouridine sites.
Front Genet. 2023 Jan 19;14:1121694. doi: 10.3389/fgene.2023.1121694. eCollection 2023.
8
Dynamic regulation and key roles of ribonucleic acid methylation.
Front Cell Neurosci. 2022 Dec 19;16:1058083. doi: 10.3389/fncel.2022.1058083. eCollection 2022.

References

1
Identifying Breast Cancer-Related Genes Based on a Novel Computational Framework Involving KEGG Pathways and PPI Network Modularity.
Front Genet. 2021 Aug 16;12:596794. doi: 10.3389/fgene.2021.596794. eCollection 2021.
2
iCircRBP-DHN: identification of circRNA-RBP interaction sites using deep hierarchical network.
Brief Bioinform. 2021 Jul 20;22(4). doi: 10.1093/bib/bbaa274.
3
A Deep Learning Framework to Predict Tumor Tissue-of-Origin Based on Copy Number Alteration.
Front Bioeng Biotechnol. 2020 Aug 5;8:701. doi: 10.3389/fbioe.2020.00701. eCollection 2020.
4
A machine learning framework to trace tumor tissue-of-origin of 13 types of cancer based on DNA somatic mutation.
Biochim Biophys Acta Mol Basis Dis. 2020 Nov 1;1866(11):165916. doi: 10.1016/j.bbadis.2020.165916. Epub 2020 Aug 7.
5
An Improved Anticancer Drug-Response Prediction Based on an Ensemble Method Integrating Matrix Completion and Ridge Regression.
Mol Ther Nucleic Acids. 2020 Sep 4;21:676-686. doi: 10.1016/j.omtn.2020.07.003. Epub 2020 Jul 10.
6
Prediction of m5C Modifications in RNA Sequences by Combining Multiple Sequence Features.
Mol Ther Nucleic Acids. 2020 Sep 4;21:332-342. doi: 10.1016/j.omtn.2020.06.004. Epub 2020 Jun 10.
7
Human geroprotector discovery by targeting the converging subnetworks of aging and age-related diseases.
Geroscience. 2020 Feb;42(1):353-372. doi: 10.1007/s11357-019-00106-x. Epub 2019 Oct 21.
8
A comprehensive comparison and analysis of computational predictors for RNA N6-methyladenosine sites of Saccharomyces cerevisiae.
Brief Funct Genomics. 2019 Nov 19;18(6):367-376. doi: 10.1093/bfgp/elz018.
9
XG-PseU: an eXtreme Gradient Boosting based method for identifying pseudouridine sites.
Mol Genet Genomics. 2020 Jan;295(1):13-21. doi: 10.1007/s00438-019-01600-9. Epub 2019 Aug 7.
10
DeepM6ASeq: prediction and characterization of m6A-containing sequences using deep learning.
BMC Bioinformatics. 2018 Dec 31;19(Suppl 19):524. doi: 10.1186/s12859-018-2516-4.