iRNA5hmC: The First Predictor to Identify RNA 5-Hydroxymethylcytosine Modifications Using Machine Learning.

Authors

Liu Yuan, Chen Dasheng, Su Ran, Chen Wei, Wei Leyi

Affiliations

College of Intelligence and Computing, Tianjin University, Tianjin, China.

Center for Genomics and Computational Biology, School of Life Sciences, North China University of Science and Technology, Tangshan, China.

Publication information

Front Bioeng Biotechnol. 2020 Mar 31;8:227. doi: 10.3389/fbioe.2020.00227. eCollection 2020.

DOI: 10.3389/fbioe.2020.00227
PMID: 32296686
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC7137033/

Abstract

RNA 5-hydroxymethylcytosine (5hmC) modification plays an important role in a range of biological processes. Characterizing its distribution across the transcriptome is essential for revealing the biological functions of 5hmC. Sequencing-based technologies allow high-throughput identification of 5hmC; however, they are labor-intensive, time-consuming, and expensive. There is therefore an urgent need for more effective and efficient computational methods, at least as a complement to the high-throughput technologies. In this study, we developed iRNA5hmC, a computational predictive protocol that identifies RNA 5hmC sites using machine learning. In this predictor, we introduced a sequence-based feature algorithm consisting of two feature representations, (1) k-mer spectrum and (2) positional nucleotide binary vector, to capture the sequential characteristics of 5hmC sites. We then applied a two-stage feature space optimization strategy to improve the feature representation ability, and trained a predictive model using a support vector machine (SVM). Our feature analysis showed that feature optimization helps capture the most discriminative features. Compared with well-known existing feature descriptors, our proposed representations separate true 5hmC sites from non-5hmC sites more accurately. To the best of our knowledge, iRNA5hmC is the first RNA 5hmC predictor that makes predictions from RNA primary sequences alone, without requiring any prior experimental knowledge. Importantly, we have established an easy-to-use webserver, currently available at http://server.malab.cn/iRNA5hmC. We expect it to have the potential to be a useful tool for the prediction of 5hmC sites.
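
The two feature representations named in the abstract can be illustrated with a minimal sketch. The abstract does not give the k values, the sequence window length, or the normalization used, so the choices below (k = 2 by default, frequencies normalized by the number of k-mer windows, a one-hot code in A/C/G/U order) and the function names `kmer_spectrum`, `positional_binary`, and `encode` are assumptions for illustration, not the authors' implementation. The two-stage feature optimization and the SVM training step are not shown.

```python
from itertools import product

ALPHABET = "ACGU"  # assumed nucleotide order for both encodings


def kmer_spectrum(seq, k=2):
    """Frequency of every possible k-mer over ACGU, in lexicographic order.

    Counts are divided by the number of k-mer windows in the sequence
    (an assumed normalization; the abstract does not specify one).
    """
    kmers = ["".join(p) for p in product(ALPHABET, repeat=k)]
    counts = dict.fromkeys(kmers, 0)
    windows = len(seq) - k + 1
    for i in range(windows):
        sub = seq[i:i + k]
        if sub in counts:  # windows with unknown bases are skipped
            counts[sub] += 1
    total = max(windows, 1)  # avoid division by zero for short input
    return [counts[m] / total for m in kmers]


def positional_binary(seq):
    """One-hot code per position: A=1000, C=0100, G=0010, U=0001.

    Any other character is encoded as 0000.
    """
    vec = []
    for base in seq:
        vec.extend(1 if base == b else 0 for b in ALPHABET)
    return vec


def encode(seq, k=2):
    """Concatenate the k-mer spectrum and the positional binary vector."""
    seq = seq.upper().replace("T", "U")  # accept DNA-style input
    return kmer_spectrum(seq, k) + positional_binary(seq)


if __name__ == "__main__":
    features = encode("ACGUC")
    print(len(features))  # 4**2 k-mer entries + 4 * 5 positional entries
```

For a fixed-length sequence window, every sample yields a vector of the same length (4^k spectrum entries plus 4 entries per position), which is the kind of input a feature-selection step and an SVM classifier would consume.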


Figures

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/9ab7/7137033/09363f44ef7e/fbioe-08-00227-g001.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/9ab7/7137033/3af4862f394c/fbioe-08-00227-g002.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/9ab7/7137033/7deb2d7e21fc/fbioe-08-00227-g003.jpg
