
4mCpred-EL: An Ensemble Learning Framework for Identification of DNA N4-methylcytosine Sites in the Mouse Genome.

Affiliations

Department of Physiology, Ajou University School of Medicine, Suwon 16499, Korea.

School of Software, Shandong University, Jinan 250101, China.

Publication information

Cells. 2019 Oct 28;8(11):1332. doi: 10.3390/cells8111332.

DOI:10.3390/cells8111332
PMID:31661923
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC6912380/
Abstract

DNA N4-methylcytosine (4mC) is one of the key epigenetic alterations, playing essential roles in DNA replication, differentiation, cell cycle, and gene expression. To better understand the biological functions of 4mC, it is crucial to know its genomic distribution. Recently, a few computational methods, in particular machine learning (ML) approaches, have been applied to 4mC site prediction. Although ML-based methods are promising for 4mC identification in other species, none are available for detecting 4mCs in the mouse genome. Our novel computational approach, called 4mCpred-EL, is the first method for identifying 4mC sites in the mouse genome; it combines four different ML algorithms with seven feature encodings. The probabilistic values predicted from those feature encodings are then used as a feature vector and fed again into the ML algorithms, whose corresponding models are integrated by ensemble learning. Our benchmarking results demonstrated that 4mCpred-EL achieved accuracy and MCC values of 0.795 and 0.591, significantly outperforming seven other classifiers by 1.5-5.9% and 3.2-11.7%, respectively. Additionally, in the independent evaluation 4mCpred-EL attained an overall accuracy of 79.80%, which is 1.8-5.1% higher than that of the seven other classifiers. We provide a user-friendly web server, also named 4mCpred-EL, which could be used as a pre-screening tool for identifying potential 4mC sites in the mouse genome.
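The two-layer idea in the abstract (base models per feature encoding, whose predicted probabilities become the input of a second layer that is then combined into an ensemble) can be illustrated with a small pure-Python sketch. Everything specific here is an assumption made for illustration: the abstract does not name the seven encodings or the four algorithms, so `composition` and `one_hot` stand in for the encodings, and the toy `CentroidScorer` (with Manhattan or Euclidean distance) stands in for the ML algorithms. The sequences are made up, and the second layer is trained on in-sample probabilities for brevity, whereas cross-validated predictions would be the usual choice.

```python
from statistics import fmean

BASES = "ACGT"

def composition(seq):
    """Illustrative encoding 1: fraction of each nucleotide in the sequence."""
    return [seq.count(b) / len(seq) for b in BASES]

def one_hot(seq):
    """Illustrative encoding 2: binary indicator per position and nucleotide."""
    return [1.0 if base == b else 0.0 for base in seq for b in BASES]

class CentroidScorer:
    """Toy stand-in for an ML algorithm (hypothetical, not from the paper):
    scores a vector by its relative distance to the two class centroids."""

    def __init__(self, order):
        self.order = order  # 1 = Manhattan distance, 2 = Euclidean distance

    def _dist(self, a, b):
        return sum(abs(x - y) ** self.order for x, y in zip(a, b)) ** (1 / self.order)

    def fit(self, X, y):
        pos = [x for x, label in zip(X, y) if label == 1]
        neg = [x for x, label in zip(X, y) if label == 0]
        self.pos_centroid = [fmean(col) for col in zip(*pos)]
        self.neg_centroid = [fmean(col) for col in zip(*neg)]
        return self

    def predict_proba(self, x):
        """Pseudo-probability of the positive class, in [0, 1]."""
        d_pos = self._dist(x, self.pos_centroid)
        d_neg = self._dist(x, self.neg_centroid)
        total = d_pos + d_neg
        return 0.5 if total == 0 else d_neg / total

ENCODERS = (composition, one_hot)
ORDERS = (1, 2)

def fit_stack(seqs, labels):
    # Layer 1: one base model per (encoding, algorithm) pair.
    base = [(enc, CentroidScorer(o).fit([enc(s) for s in seqs], labels))
            for enc in ENCODERS for o in ORDERS]
    # The base models' predicted probabilities form the new feature vector.
    # In-sample here for brevity; cross-validated predictions are more usual.
    meta_X = [[m.predict_proba(enc(s)) for enc, m in base] for s in seqs]
    # Layer 2: each algorithm is trained again on the probability features.
    meta = [CentroidScorer(o).fit(meta_X, labels) for o in ORDERS]
    return base, meta

def predict_stack(model, seq):
    base, meta = model
    features = [m.predict_proba(enc(seq)) for enc, m in base]
    # Ensemble step: average the second-layer outputs.
    return fmean(m.predict_proba(features) for m in meta)

# Made-up toy data: label 1 = "4mC-like", label 0 = "non-4mC-like".
train_seqs = ["CCGCCGCC", "GCCCGCCG", "ATATTAAT", "TTAATATA"]
train_labels = [1, 1, 0, 0]
model = fit_stack(train_seqs, train_labels)
p_pos = predict_stack(model, "CCGCCCGC")
p_neg = predict_stack(model, "ATTATAAT")
print(p_pos, p_neg)
```

Averaging is only one way to integrate the second-layer models; the abstract does not say which combination rule 4mCpred-EL uses.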

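The accuracy and Matthews correlation coefficient (MCC) quoted in the abstract follow the standard confusion-matrix definitions, sketched below. The counts in the example are hypothetical, not taken from the paper; they were chosen only so that the accuracy comes out at 0.795.

```python
from math import sqrt

def accuracy(tp, tn, fp, fn):
    """Fraction of all predictions that are correct."""
    return (tp + tn) / (tp + tn + fp + fn)

def mcc(tp, tn, fp, fn):
    """Matthews correlation coefficient; returns 0.0 when undefined."""
    denom = sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return 0.0 if denom == 0 else (tp * tn - fp * fn) / denom

# Hypothetical confusion matrix (NOT from the paper).
tp, tn, fp, fn = 80, 79, 21, 20
acc_value = accuracy(tp, tn, fp, fn)
mcc_value = mcc(tp, tn, fp, fn)
print(round(acc_value, 3))  # 0.795
print(round(mcc_value, 3))  # 0.59
```

MCC uses all four cells of the confusion matrix, so it is less easily inflated by class imbalance than accuracy alone, which is one reason the two are often reported together.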

Figure 1: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/d48b/6912380/f6e86863b0a1/cells-08-01332-g001.jpg
Figure 2: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/d48b/6912380/440e4e0f795e/cells-08-01332-g002.jpg
Figure 3: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/d48b/6912380/e001eb4b02c4/cells-08-01332-g003.jpg
Figure 4: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/d48b/6912380/7bf5fff57772/cells-08-01332-g004.jpg
Figure 5: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/d48b/6912380/6b347a056a13/cells-08-01332-g005.jpg
Figure 6: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/d48b/6912380/3c87628083f5/cells-08-01332-g006.jpg

