Deep6mA: A deep learning framework for exploring similar patterns in DNA N6-methyladenine sites across different species.

Affiliations

Department of Mathematics, College of Science, Nanjing Agricultural University, Nanjing, China.

Center for Data Science, Zhejiang University, Hangzhou, China.

Publication information

PLoS Comput Biol. 2021 Feb 18;17(2):e1008767. doi: 10.1371/journal.pcbi.1008767. eCollection 2021 Feb.

DOI:10.1371/journal.pcbi.1008767
PMID:33600435
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC7924747/
Abstract

N6-methyladenine (6mA) is an important DNA modification associated with a wide range of biological processes. Accurately identifying 6mA sites on a genomic scale is crucial for understanding 6mA's biological functions. However, existing experimental techniques for detecting 6mA sites are costly, which creates a strong need for new computational methods. In this paper, we developed a deep learning framework named Deep6mA to identify DNA 6mA sites; it requires neither prior knowledge of 6mA nor manually crafted sequence features, and its performance is superior to that of other DNA 6mA prediction tools. Specifically, 5-fold cross-validation on a rice benchmark dataset gives Deep6mA a sensitivity of 92.96% and a specificity of 95.06%, with an overall prediction accuracy of 94%. Importantly, we find that sequences containing 6mA sites share similar patterns across species. The model trained on rice data predicts the 6mA sites of three other species (Arabidopsis thaliana, Fragaria vesca and Rosa chinensis) with an accuracy above 90%. In addition, we find that (1) 6mA tends to occur at GAGG motifs, which suggests that the sequence near 6mA sites may be conserved; and (2) 6mA is enriched in the TATA box of promoters, which may be the main way it regulates downstream gene expression.
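The three figures reported in the abstract (sensitivity, specificity, accuracy) are related through the confusion matrix. The sketch below is not the authors' evaluation code; it only illustrates the standard definitions. The function names and the counts are hypothetical, chosen on the assumption of a balanced test set with 10,000 positive and 10,000 negative samples. Under that assumption, the accuracy equals the mean of sensitivity and specificity, which is consistent with the reported 92.96%, 95.06% and roughly 94%.

```python
from typing import Dict, Sequence, Tuple


def confusion_counts(y_true: Sequence[int], y_pred: Sequence[int]) -> Tuple[int, int, int, int]:
    """Return (tp, tn, fp, fn) for binary labels, where 1 marks a 6mA site."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    return tp, tn, fp, fn


def metrics(tp: int, tn: int, fp: int, fn: int) -> Dict[str, float]:
    """Standard binary-classification metrics from confusion-matrix counts."""
    return {
        "sensitivity": tp / (tp + fn),            # fraction of true 6mA sites recovered
        "specificity": tn / (tn + fp),            # fraction of non-6mA sites rejected
        "accuracy": (tp + tn) / (tp + tn + fp + fn),
    }


# Hypothetical counts (assumption: 10,000 positives and 10,000 negatives),
# picked so that sensitivity and specificity match the abstract's values.
example = metrics(tp=9296, tn=9506, fp=494, fn=704)

if __name__ == "__main__":
    for name, value in example.items():
        print(f"{name}: {value:.2%}")
```

On an imbalanced dataset the accuracy would instead be a weighted average of the two rates, so the agreement above holds only under the balanced-set assumption.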

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/8ce0/7924747/732efadb6b20/pcbi.1008767.g001.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/8ce0/7924747/961ea5c844e7/pcbi.1008767.g002.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/8ce0/7924747/7728f2ae732f/pcbi.1008767.g003.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/8ce0/7924747/8dde4fd0b679/pcbi.1008767.g004.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/8ce0/7924747/0b733fcde183/pcbi.1008767.g005.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/8ce0/7924747/4b40e05be640/pcbi.1008767.g006.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/8ce0/7924747/a1505d9a8853/pcbi.1008767.g007.jpg
