EDLm6APred: ensemble deep learning approach for mRNA m6A site prediction.

Affiliations

Engineering Research Center of Intelligent Control for Underground Space, Ministry of Education, China University of Mining and Technology, Xuzhou, 221116, China.

School of Information and Control Engineering, China University of Mining and Technology, Xuzhou, 221116, China.

Publication information

BMC Bioinformatics. 2021 May 29;22(1):288. doi: 10.1186/s12859-021-04206-4.

DOI:10.1186/s12859-021-04206-4

PMID:34051729

Original article: https://pmc.ncbi.nlm.nih.gov/articles/PMC8164815/

Abstract

BACKGROUND

As a common and abundant RNA methylation modification, N6-methyladenosine (m6A) is widespread in the transcriptomes of many species and is closely related to the occurrence and development of various life processes and diseases. Accurate identification of m6A methylation sites has therefore become a hot topic. Most biological methods rely on high-throughput sequencing technology, which places great demands on sequencing library preparation and data analysis. Various machine learning methods have therefore been proposed that extract several types of sequence-based features and then employ conventional classifiers, such as SVM and RF, to identify m6A methylation sites. However, identification performance relies heavily on the extracted features, which still need to be improved.

RESULTS

This paper studies feature extraction and classification of m6A methylation sites from a natural language processing perspective, integrating feature extraction and classification in a single model while taking the upstream and downstream information of m6A sites into account. One-hot encoding, RNA word embedding, and Word2vec are adopted to depict sites from the perspectives of the base as well as its upstream and downstream sequence. A BiLSTM model, a well-known sequence model, was then constructed to discriminate the sequences with potential m6A sites. Since the three feature extraction methods focus on different perspectives of m6A sites, an ensemble deep learning predictor (EDLm6APred) was finally constructed for m6A site prediction. Experimental results on human and mouse data sets show that EDLm6APred outperforms the models built on any single encoding, indicating that base, upstream, and downstream information are all essential for m6A site detection. Compared with existing m6A methylation site prediction models that do not use genomic features, EDLm6APred obtains an area under the receiver operating characteristic curve (AUROC) of 86.6% on the human data sets, indicating the effectiveness of sequential modeling on RNA. For user convenience, a webserver implementing EDLm6APred was developed and made publicly available at www.xjtlu.edu.cn/biologicalsciences/EDLm6APred.
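The encoding and ensembling ideas described above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the k-mer width `k=3`, the helper names (`one_hot`, `kmer_tokens`, `build_vocab`, `soft_vote`), and the use of simple probability averaging to combine the three models are all assumptions made for illustration, since the abstract does not state these details. The BiLSTM models and the Word2vec training step are omitted; the sketch only shows how a sequence window might be turned into model inputs and how three per-model scores might be merged.

```python
from typing import Dict, Iterable, List, Sequence

BASES = "ACGU"  # RNA alphabet; any other symbol is encoded as all zeros


def one_hot(seq: str) -> List[List[int]]:
    """Encode each base as a 4-dimensional indicator vector."""
    return [[1 if base == b else 0 for b in BASES] for base in seq.upper()]


def kmer_tokens(seq: str, k: int = 3) -> List[str]:
    """Slide a window of width k over the sequence to get overlapping 'words'.

    The width k=3 is an assumption; such tokens are the usual input to an
    embedding layer or a Word2vec model.
    """
    seq = seq.upper()
    return [seq[i:i + k] for i in range(len(seq) - k + 1)]


def build_vocab(token_lists: Iterable[List[str]]) -> Dict[str, int]:
    """Map each distinct k-mer to an integer index (0 is left for unseen tokens)."""
    vocab: Dict[str, int] = {}
    for tokens in token_lists:
        for token in tokens:
            if token not in vocab:
                vocab[token] = len(vocab) + 1
    return vocab


def soft_vote(probabilities: Sequence[float]) -> float:
    """Combine per-model probabilities by averaging (an assumed ensemble rule)."""
    return sum(probabilities) / len(probabilities)


if __name__ == "__main__":
    window = "GGACU"  # toy window centred on a candidate adenosine
    print(one_hot(window))
    tokens = kmer_tokens(window)
    print(tokens, build_vocab([tokens]))
    # Hypothetical scores from the one-hot, embedding, and Word2vec models
    print(soft_vote([0.9, 0.6, 0.75]))
```

In this sketch the one-hot matrix describes each base individually, while the k-mer tokens carry local upstream and downstream context, which mirrors the complementary perspectives the abstract attributes to the three encodings.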

CONCLUSIONS

Our proposed EDLm6APred method is a reliable predictor for m6A methylation sites.

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/c6f9/8164815/1720ba14963a/12859_2021_4206_Fig1_HTML.jpg

Similar articles

EDLm6APred: ensemble deep learning approach for mRNA m6A site prediction.

BMC Bioinformatics. 2021 May 29;22(1):288. doi: 10.1186/s12859-021-04206-4.

EMDLP: Ensemble multiscale deep learning model for RNA methylation site prediction.

BMC Bioinformatics. 2022 Jun 8;23(1):221. doi: 10.1186/s12859-022-04756-1.

Comprehensive review and assessment of computational methods for predicting RNA post-transcriptional modification sites from RNA sequences.

Brief Bioinform. 2020 Sep 25;21(5):1676-1696. doi: 10.1093/bib/bbz112.

Predicting N6-Methyladenosine Sites in Multiple Tissues of Mammals through Ensemble Deep Learning.

Int J Mol Sci. 2022 Dec 7;23(24):15490. doi: 10.3390/ijms232415490.

Gene2vec: gene subsequence embedding for prediction of mammalian N6-methyladenosine sites from mRNA.

RNA. 2019 Feb;25(2):205-218. doi: 10.1261/rna.069112.118. Epub 2018 Nov 13.

Computational identification of N6-methyladenosine sites in multiple tissues of mammals.

Comput Struct Biotechnol J. 2020 Apr 30;18:1084-1091. doi: 10.1016/j.csbj.2020.04.015. eCollection 2020.

WHISTLE: a high-accuracy map of the human N6-methyladenosine (m6A) epitranscriptome predicted using a machine learning approach.

Nucleic Acids Res. 2019 Apr 23;47(7):e41. doi: 10.1093/nar/gkz074.

BERMP: a cross-species classifier for predicting m6A sites by integrating a deep learning algorithm and a random forest approach.

Int J Biol Sci. 2018 Sep 7;14(12):1669-1677. doi: 10.7150/ijbs.27819. eCollection 2018.

DeepM6ASeq: prediction and characterization of m6A-containing sequences using deep learning.

BMC Bioinformatics. 2018 Dec 31;19(Suppl 19):524. doi: 10.1186/s12859-018-2516-4.

WHISTLE: A Functionally Annotated High-Accuracy Map of Human m6A Epitranscriptome.

Methods Mol Biol. 2021;2284:519-529. doi: 10.1007/978-1-0716-1307-8_28.

Cited by

Hybrid representation learning for human m6A modifications with chromosome-level generalizability.

Bioinform Adv. 2025 Jul 14;5(1):vbaf170. doi: 10.1093/bioadv/vbaf170. eCollection 2025.

Machine learning-augmented m6A-Seq analysis without a reference genome.

Brief Bioinform. 2025 May 1;26(3). doi: 10.1093/bib/bbaf235.

Genome language modeling (GLM): a beginner's cheat sheet.

Biol Methods Protoc. 2025 Mar 25;10(1):bpaf022. doi: 10.1093/biomethods/bpaf022. eCollection 2025.

RNA-ModX: a multilabel prediction and interpretation framework for RNA modifications.

Brief Bioinform. 2024 Nov 22;26(1). doi: 10.1093/bib/bbae688.

MSCAN: multi-scale self- and cross-attention network for RNA methylation site prediction.

BMC Bioinformatics. 2024 Jan 17;25(1):32. doi: 10.1186/s12859-024-05649-1.

EnsembleDL-ATG: Identifying autophagy proteins by integrating their sequence and evolutionary information using an ensemble deep learning framework.

Comput Struct Biotechnol J. 2023 Sep 29;21:4836-4848. doi: 10.1016/j.csbj.2023.09.036. eCollection 2023.

Dynamic regulation and key roles of ribonucleic acid methylation.

Front Cell Neurosci. 2022 Dec 19;16:1058083. doi: 10.3389/fncel.2022.1058083. eCollection 2022.

Analysis approaches for the identification and prediction of N6-methyladenosine sites.

Epigenetics. 2023 Dec;18(1):2158284. doi: 10.1080/15592294.2022.2158284. Epub 2022 Dec 23.

4acCPred: Weakly supervised prediction of N4-acetyldeoxycytosine DNA modification from sequences.

Mol Ther Nucleic Acids. 2022 Oct 14;30:337-345. doi: 10.1016/j.omtn.2022.10.004. eCollection 2022 Dec 13.

EMDLP: Ensemble multiscale deep learning model for RNA methylation site prediction.

BMC Bioinformatics. 2022 Jun 8;23(1):221. doi: 10.1186/s12859-022-04756-1.
