i6mA-Vote: Cross-Species Identification of DNA N6-Methyladenine Sites in Plant Genomes Based on Ensemble Learning With Voting.

Authors

Teng Zhixia, Zhao Zhengnan, Li Yanjuan, Tian Zhen, Guo Maozu, Lu Qianzi, Wang Guohua

Affiliations

College of Information and Computer Engineering, Northeast Forestry University, Harbin, China.

College of Electrical and Information Engineering, Quzhou University, Quzhou, China.

Publication information

Front Plant Sci. 2022 Feb 14;13:845835. doi: 10.3389/fpls.2022.845835. eCollection 2022.

DOI:10.3389/fpls.2022.845835
PMID:35237293
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC8882731/
Abstract

DNA N6-methyladenine (6mA) is a common epigenetic modification that plays significant roles in the growth and development of plants. Identifying 6mA sites is crucial for elucidating the functions of 6mA. In this article, a novel model named i6mA-Vote is developed to predict 6mA sites in plants. First, DNA sequences were encoded into six feature vectors using diverse strategies based on the density, physicochemical properties, and position of nucleotides. To find the best encoding strategy, the feature vectors were compared across several machine learning classifiers. The results suggested that nucleotide position has a significant positive effect on 6mA site identification. Thus, the dinucleotide one-hot strategy, which describes the positional characteristics of nucleotides well, was employed to extract DNA features in our method. Second, DNA sequences of Rosaceae were randomly divided into a training dataset and a test dataset. Finally, i6mA-Vote was constructed by combining five different base classifiers under a majority voting strategy and trained on the Rosaceae training dataset. i6mA-Vote was evaluated on the task of predicting 6mA sites in the genomes of Rosaceae, rice, and Arabidopsis separately. On Rosaceae, i6mA-Vote achieved 0.955 accuracy (ACC), 0.909 Matthews correlation coefficient (MCC), 0.955 sensitivity (SN), and 0.954 specificity (SP). These indicators, in the order ACC, MCC, SN, SP, were 0.882, 0.774, 0.961, and 0.803 on rice and 0.798, 0.617, 0.666, and 0.929 on Arabidopsis. According to these indicators, our method was effective and better than the other methods compared. The results also showed that i6mA-Vote performs well in 6mA site prediction not only within a species but also across plant species. Moreover, specificity is distinctly lower than sensitivity on rice, while the opposite holds on Arabidopsis. This may result from sequence similarity among Rosaceae, rice, and Arabidopsis.
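
The three technical ingredients named in the abstract (dinucleotide one-hot encoding, majority voting over five base classifiers, and the ACC/MCC/SN/SP indicators) can be illustrated with a minimal Python sketch. This is not the authors' implementation. It assumes that each overlapping dinucleotide is mapped to a 16-dimensional indicator vector, that voting is a plain hard vote over binary labels, and that the indicators follow their standard confusion-matrix definitions. The function names `dinucleotide_one_hot`, `majority_vote`, and `evaluate` are hypothetical, and the abstract does not say which five base classifiers were used.

```python
from itertools import product
from math import sqrt

# All 16 dinucleotides in a fixed order: AA, AC, AG, AT, CA, ...
DINUCLEOTIDES = ["".join(p) for p in product("ACGT", repeat=2)]
INDEX = {d: i for i, d in enumerate(DINUCLEOTIDES)}


def dinucleotide_one_hot(seq):
    """Concatenate one 16-dim one-hot block per overlapping dinucleotide.

    Assumption: a sequence of length L yields 16 * (L - 1) features, so the
    position of every dinucleotide is preserved in the feature vector.
    """
    seq = seq.upper()
    features = []
    for i in range(len(seq) - 1):
        block = [0] * 16
        idx = INDEX.get(seq[i:i + 2])
        if idx is not None:  # ambiguous bases such as N leave the block all-zero
            block[idx] = 1
        features.extend(block)
    return features


def majority_vote(votes):
    """Hard majority vote over binary labels (1 = 6mA site, 0 = non-site)."""
    return 1 if sum(votes) * 2 > len(votes) else 0


def evaluate(y_true, y_pred):
    """Return ACC, MCC, SN, and SP from true and predicted binary labels."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    denom = sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return {
        "ACC": (tp + tn) / len(pairs),
        "MCC": (tp * tn - fp * fn) / denom if denom else 0.0,
        "SN": tp / (tp + fn) if tp + fn else 0.0,
        "SP": tn / (tn + fp) if tn + fp else 0.0,
    }


# Hypothetical votes from five base classifiers for a single candidate site.
label = majority_vote([1, 1, 0, 1, 0])
```

With five voters a tie cannot occur, which is one practical reason to combine an odd number of base classifiers under hard voting.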

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/5f3e/8882731/b9d2cb1e8873/fpls-13-845835-g001.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/5f3e/8882731/3d0cd2f269f1/fpls-13-845835-g002.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/5f3e/8882731/d9ce2a92d597/fpls-13-845835-g003.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/5f3e/8882731/214051154f41/fpls-13-845835-g004.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/5f3e/8882731/6380d76ab28f/fpls-13-845835-g005.jpg