i4mC-EL: Identifying DNA N4-Methylcytosine Sites in the Mouse Genome Using Ensemble Learning.

Affiliation

College of Information and Computer Engineering, Northeast Forestry University, Harbin 150040, China.

Publication information

Biomed Res Int. 2021 May 29;2021:5515342. doi: 10.1155/2021/5515342. eCollection 2021.

DOI:10.1155/2021/5515342
PMID:34159192
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC8187051/
Abstract

As one of the important epigenetic modifications, DNA N4-methylcytosine (4mC) plays a crucial role in controlling gene replication, expression, cell cycle, DNA replication, and differentiation. Accurate identification of 4mC sites is necessary to understand their biological functions. In this paper, we use ensemble learning to develop a model named i4mC-EL to identify 4mC sites in the mouse genome. First, a multifeature encoding scheme consisting of Kmer and EIIP was adopted to describe the DNA sequences. Second, on the basis of this encoding scheme, we developed a stacked ensemble model in which four machine learning algorithms, namely BayesNet, NaiveBayes, LibSVM, and Voted Perceptron, serve as base classifiers whose intermediate outputs are the input to the metaclassifier, Logistic. Experimental results on the independent test dataset show that the overall prediction accuracy of i4mC-EL is 82.19%, which is better than that of existing methods. The user-friendly website implementing i4mC-EL can be accessed freely.
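
The multifeature encoding mentioned in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not state the value of k, the sequence window length, or which EIIP variant was used, so the default `k=2`, the per-nucleotide EIIP mapping, and the function names `kmer_frequencies`, `eiip_encoding`, and `encode` are all assumptions made for illustration. The EIIP values are the ones commonly quoted for the four nucleotides.

```python
from itertools import product

# Electron-ion interaction pseudopotential (EIIP) values commonly quoted
# for the four DNA nucleotides.
EIIP = {"A": 0.1260, "C": 0.1340, "G": 0.0806, "T": 0.1335}


def kmer_frequencies(seq, k=2):
    """Return the normalized frequency of every possible k-mer.

    The 4**k k-mers are listed in lexicographic order (A < C < G < T).
    Windows containing a non-ACGT character are counted in the total
    but do not contribute to any k-mer (an assumption of this sketch).
    """
    seq = seq.upper()
    total = len(seq) - k + 1
    if total <= 0:
        raise ValueError("sequence is shorter than k")
    kmers = ["".join(p) for p in product("ACGT", repeat=k)]
    counts = dict.fromkeys(kmers, 0)
    for i in range(total):
        window = seq[i:i + k]
        if window in counts:
            counts[window] += 1
    return [counts[m] / total for m in kmers]


def eiip_encoding(seq):
    """Map each nucleotide to its EIIP value (0.0 for unknown characters)."""
    return [EIIP.get(base, 0.0) for base in seq.upper()]


def encode(seq, k=2):
    """Concatenate Kmer frequencies and per-position EIIP values."""
    return kmer_frequencies(seq, k) + eiip_encoding(seq)


if __name__ == "__main__":
    features = encode("ACGTACGTAC", k=2)
    print(len(features))
```

Under these assumptions the feature vector has `4**k + len(seq)` entries, so every sequence passed to a downstream classifier would need the same length.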

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/63bd/8187051/3216c033ebb1/BMRI2021-5515342.001.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/63bd/8187051/2baa53bac7ba/BMRI2021-5515342.002.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/63bd/8187051/aa249ddb2a83/BMRI2021-5515342.003.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/63bd/8187051/d4c26c5ce1b0/BMRI2021-5515342.004.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/63bd/8187051/bb74defd7381/BMRI2021-5515342.005.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/63bd/8187051/044d758b9653/BMRI2021-5515342.006.jpg

Similar articles

1. i4mC-EL: Identifying DNA N4-Methylcytosine Sites in the Mouse Genome Using Ensemble Learning.
Biomed Res Int. 2021 May 29;2021:5515342. doi: 10.1155/2021/5515342. eCollection 2021.
2. i4mC-ROSE, a bioinformatics tool for the identification of DNA N4-methylcytosine sites in the Rosaceae genome.
Int J Biol Macromol. 2020 Aug 15;157:752-758. doi: 10.1016/j.ijbiomac.2019.12.009. Epub 2019 Dec 2.
3. 4mCpred-EL: An Ensemble Learning Framework for Identification of DNA N4-methylcytosine Sites in the Mouse Genome.
Cells. 2019 Oct 28;8(11):1332. doi: 10.3390/cells8111332.
4. i4mC-Mouse: Improved identification of DNA N4-methylcytosine sites in the mouse genome using multiple encoding schemes.
Comput Struct Biotechnol J. 2020 Apr 8;18:906-912. doi: 10.1016/j.csbj.2020.04.001. eCollection 2020.
5. Computational identification of N4-methylcytosine sites in the mouse genome with machine-learning method.
Math Biosci Eng. 2021 Apr 15;18(4):3348-3363. doi: 10.3934/mbe.2021167.
6. i4mC-Deep: An Intelligent Predictor of N4-Methylcytosine Sites Using a Deep Learning Approach with Chemical Properties.
Genes (Basel). 2021 Jul 23;12(8):1117. doi: 10.3390/genes12081117.
7. 4mCPred-CNN-Prediction of DNA N4-Methylcytosine in the Mouse Genome Using a Convolutional Neural Network.
Genes (Basel). 2021 Feb 20;12(2):296. doi: 10.3390/genes12020296.
8. Deep4mC: systematic assessment and computational prediction for DNA N4-methylcytosine sites by deep learning.
Brief Bioinform. 2021 May 20;22(3). doi: 10.1093/bib/bbaa099.
9. Iterative feature representations improve N4-methylcytosine site prediction.
Bioinformatics. 2019 Dec 1;35(23):4930-4937. doi: 10.1093/bioinformatics/btz408.
10. i4mC-GRU: Identifying DNA N4-Methylcytosine sites in mouse genomes using bidirectional gated recurrent unit and sequence-embedded features.
Comput Struct Biotechnol J. 2023 May 16;21:3045-3053. doi: 10.1016/j.csbj.2023.05.014. eCollection 2023.

Cited by

1. A Grid Search-Based Multilayer Dynamic Ensemble System to Identify DNA N4-Methylcytosine Using Deep Learning Approach.
Genes (Basel). 2023 Feb 25;14(3):582. doi: 10.3390/genes14030582.
2. MultiScale-CNN-4mCPred: a multi-scale CNN and adaptive embedding-based method for mouse genome DNA N4-methylcytosine prediction.
BMC Bioinformatics. 2023 Jan 18;24(1):21. doi: 10.1186/s12859-023-05135-0.
3. Systematic Analysis and Accurate Identification of DNA N4-Methylcytosine Sites by Deep Learning.
Front Microbiol. 2022 Mar 15;13:843425. doi: 10.3389/fmicb.2022.843425. eCollection 2022.
