
i4mC-Mouse: Improved identification of DNA N4-methylcytosine sites in the mouse genome using multiple encoding schemes.

Authors

Hasan Md Mehedi, Manavalan Balachandran, Shoombuatong Watshara, Khatun Mst Shamima, Kurata Hiroyuki

Affiliations

Department of Bioscience and Bioinformatics, Kyushu Institute of Technology, 680-4 Kawazu, Iizuka, Fukuoka 820-8502, Japan.

Japan Society for the Promotion of Science, 5-3-1 Kojimachi, Chiyoda-ku, Tokyo 102-0083, Japan.

Publication information

Comput Struct Biotechnol J. 2020 Apr 8;18:906-912. doi: 10.1016/j.csbj.2020.04.001. eCollection 2020.

DOI:10.1016/j.csbj.2020.04.001
PMID:32322372
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC7168350/
Abstract

N4-methylcytosine (4mC) is one of the most important DNA modifications and is involved in regulating cell differentiation and gene expression. Accurate identification of 4mC sites is necessary to understand various biological functions. In this work, we developed a new computational predictor called i4mC-Mouse to identify 4mC sites in the mouse genome. We explored six encoding schemes that cover different characteristics of DNA sequence information: k-space nucleotide composition (KSNC), k-mer nucleotide composition (Kmer), mono nucleotide binary encoding (MBE), dinucleotide binary encoding, electron-ion interaction pseudo potentials (EIIP), and dinucleotide physicochemical composition. We then built six random forest (RF) models, one per encoding, and linearly combined their probability scores to construct the final predictor. Four of the six encodings proved sufficient: Kmer, KSNC, MBE, and EIIP contributed 10%, 45%, 25%, and 20% of the prediction performance, respectively. On the independent test set, i4mC-Mouse predicted 4mC sites with an accuracy of 0.816 and a Matthews correlation coefficient (MCC) of 0.633, approximately 2.5% and 5% higher, respectively, than those of the existing method (4mCpred-EL). For experimental biologists, a freely available web application was implemented at http://kurata14.bio.kyutech.ac.jp/i4mC-Mouse/.
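
The encoding and score-combination steps described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. The function names, the gap and k parameters, and the use of the reported contribution percentages directly as linear weights are all assumptions made for illustration; the EIIP values are the commonly tabulated ones for A, C, G, and T. The per-encoding probabilities passed to `combine` stand in for the outputs of the RF models, which are not trained here.

```python
from itertools import product

# Assumption: the contributions reported in the abstract are used directly
# as linear weights; the two remaining encodings implicitly get weight 0.
WEIGHTS = {"Kmer": 0.10, "KSNC": 0.45, "MBE": 0.25, "EIIP": 0.20}

ONE_HOT = {"A": (1, 0, 0, 0), "C": (0, 1, 0, 0),
           "G": (0, 0, 1, 0), "T": (0, 0, 0, 1)}
# Commonly tabulated electron-ion interaction pseudo potential values.
EIIP = {"A": 0.1260, "C": 0.1340, "G": 0.0806, "T": 0.1335}


def mbe(seq):
    """Mono nucleotide binary encoding: four bits per position."""
    return [bit for base in seq for bit in ONE_HOT[base]]


def eiip(seq):
    """Replace each nucleotide with its EIIP value."""
    return [EIIP[base] for base in seq]


def kmer(seq, k=2):
    """Normalized frequency of every possible k-mer in the sequence."""
    names = ["".join(p) for p in product("ACGT", repeat=k)]
    counts = dict.fromkeys(names, 0)
    total = len(seq) - k + 1
    for i in range(total):
        counts[seq[i:i + k]] += 1
    return [counts[name] / total for name in names]


def ksnc(seq, gap=1):
    """Frequency of nucleotide pairs separated by `gap` positions
    (an assumed reading of k-space nucleotide composition)."""
    names = ["".join(p) for p in product("ACGT", repeat=2)]
    counts = dict.fromkeys(names, 0)
    total = len(seq) - gap - 1
    for i in range(total):
        counts[seq[i] + seq[i + gap + 1]] += 1
    return [counts[name] / total for name in names]


def combine(scores, weights=WEIGHTS):
    """Linearly combine per-encoding probability scores."""
    return sum(weights[name] * scores[name] for name in weights)


# Hypothetical per-model probabilities for one candidate site.
example_scores = {"Kmer": 0.8, "KSNC": 0.6, "MBE": 0.7, "EIIP": 0.5}
final_score = combine(example_scores)
```

In this sketch each feature vector would feed its own classifier, and only the resulting probabilities are mixed, so an encoding with a larger weight has a proportionally larger say in the final score.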


