iDNA-OpenPrompt: OpenPrompt learning model for identifying DNA methylation.

Authors

Yu Xia, Ren Jia, Long Haixia, Zeng Rao, Zhang Guoqiang, Bilal Anas, Cui Yani

Affiliations

School of Information and Communication Engineering, Hainan University, Haikou, Hainan, China.

School of Information Science and Technology, Hainan Normal University, Haikou, Hainan, China.

Publication information

Front Genet. 2024 Apr 16;15:1377285. doi: 10.3389/fgene.2024.1377285. eCollection 2024.

DOI:10.3389/fgene.2024.1377285
PMID:38689652
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC11058834/
Abstract

DNA methylation is a critical epigenetic modification in which a methyl group is added to the DNA molecule; it plays a key role in regulating gene expression without changing the DNA sequence. The main difficulty in identifying DNA methylation sites lies in the subtle and complex nature of methylation patterns, which may vary across tissues, developmental stages, and environmental conditions. Traditional methods for methylation site identification, such as bisulfite sequencing, are typically labor-intensive and costly and require large amounts of DNA, which hinders high-throughput analysis. Moreover, these methods may not always provide the resolution needed to detect methylation at specific sites, especially in genomic regions that are rich in repetitive sequences or have low levels of methylation. Furthermore, current deep learning approaches generally lack sufficient accuracy. This study introduces the iDNA-OpenPrompt model, which leverages the OpenPrompt learning framework. The model combines a prompt template, a prompt verbalizer, and a pre-trained language model (PLM) to construct a prompt-learning framework for DNA methylation sequences. A DNA vocabulary library, a BERT tokenizer, and specific label words are also introduced into the model to enable accurate identification of DNA methylation sites. An extensive analysis is conducted to evaluate the predictive performance, reliability, and consistency of the iDNA-OpenPrompt model. The experimental outcomes, covering 17 benchmark datasets that include various species and three DNA methylation modifications (4mC, 5hmC, 6mA), consistently indicate that the model surpasses existing approaches in performance and robustness.
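
The components named in the abstract (a DNA vocabulary, a prompt template with a mask slot, and a verbalizer that maps label words to classes) can be illustrated with a small, library-free sketch. This is not the authors' implementation and does not use the OpenPrompt API: the k-mer size, the template wording, the label words, and the stand-in scores are all assumptions chosen for illustration, and the function names `kmer_tokenize`, `apply_template`, and `verbalize` are hypothetical.

```python
from typing import Dict, List


def kmer_tokenize(seq: str, k: int = 3) -> List[str]:
    """Split a DNA sequence into overlapping k-mers (assumed tokenization scheme)."""
    seq = seq.upper()
    return [seq[i:i + k] for i in range(len(seq) - k + 1)]


def apply_template(tokens: List[str], mask_token: str = "[MASK]") -> str:
    """Wrap the tokens in a hypothetical prompt template with a single mask slot."""
    return f"{' '.join(tokens)} . The site is {mask_token} ."


def verbalize(mask_scores: Dict[str, float],
              label_words: Dict[str, List[str]]) -> str:
    """Map word scores at the mask position onto class labels.

    Each class score is the mean score of its label words; the
    highest-scoring class is returned.
    """
    class_scores = {
        label: sum(mask_scores.get(word, 0.0) for word in words) / len(words)
        for label, words in label_words.items()
    }
    return max(class_scores, key=class_scores.get)


# Hypothetical label words for a binary methylation-site task.
LABEL_WORDS = {
    "methylated": ["methylated", "modified"],
    "unmethylated": ["unmethylated", "normal"],
}

tokens = kmer_tokenize("ACGTAC", k=3)
prompt = apply_template(tokens)

# Stand-in for the scores a pre-trained language model would assign
# to candidate words at the mask position (made-up numbers).
fake_scores = {"methylated": 0.6, "modified": 0.2,
               "unmethylated": 0.1, "normal": 0.1}
prediction = verbalize(fake_scores, LABEL_WORDS)

print(prompt)
print(prediction)
```

In a real prompt-learning setup the stand-in scores would come from the PLM's masked-token output, and the template and label words would be tuned per dataset.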

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/e1ac/11058834/9378354d6b56/fgene-15-1377285-g001.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/e1ac/11058834/8628dc3d62ce/fgene-15-1377285-g002.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/e1ac/11058834/e2b8298e36ed/fgene-15-1377285-g003.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/e1ac/11058834/9974adf2b896/fgene-15-1377285-g004.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/e1ac/11058834/11a6957f7bd2/fgene-15-1377285-g005.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/e1ac/11058834/d84b37e6a8bd/fgene-15-1377285-g006.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/e1ac/11058834/82675f8bacd4/fgene-15-1377285-g007.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/e1ac/11058834/f82870052eee/fgene-15-1377285-g008.jpg

Similar articles

1
iDNA-OpenPrompt: OpenPrompt learning model for identifying DNA methylation.
Front Genet. 2024 Apr 16;15:1377285. doi: 10.3389/fgene.2024.1377285. eCollection 2024.
2
iDNA-ABT: advanced deep learning model for detecting DNA methylation with adaptive features and transductive information maximization.
Bioinformatics. 2021 Dec 11;37(24):4603-4610. doi: 10.1093/bioinformatics/btab677.
3
iDNA-MT: Identification DNA Modification Sites in Multiple Species by Using Multi-Task Learning Based a Neural Network Tool.
Front Genet. 2021 Mar 31;12:663572. doi: 10.3389/fgene.2021.663572. eCollection 2021.
4
DeepSF-4mC: A deep learning model for predicting DNA cytosine 4mC methylation sites leveraging sequence features.
Comput Biol Med. 2024 Mar;171:108166. doi: 10.1016/j.compbiomed.2024.108166. Epub 2024 Feb 16.
5
iDNA-MS: An Integrated Computational Tool for Detecting DNA Modification Sites in Multiple Genomes.
iScience. 2020 Apr 24;23(4):100991. doi: 10.1016/j.isci.2020.100991. Epub 2020 Mar 19.
6
EpiTEAmDNA: Sequence feature representation via transfer learning and ensemble learning for identifying multiple DNA epigenetic modification types across species.
Comput Biol Med. 2023 Jun;160:107030. doi: 10.1016/j.compbiomed.2023.107030. Epub 2023 May 11.
7
BERT2OME: Prediction of 2'-O-Methylation Modifications From RNA Sequence by Transformer Architecture Based on BERT.
IEEE/ACM Trans Comput Biol Bioinform. 2023 May-Jun;20(3):2177-2189. doi: 10.1109/TCBB.2023.3237769.
8
4mCPred-MTL: Accurate Identification of DNA 4mC Sites in Multiple Species Using Multi-Task Deep Learning Based on Multi-Head Attention Mechanism.
Front Cell Dev Biol. 2021 May 10;9:664669. doi: 10.3389/fcell.2021.664669. eCollection 2021.
9
Deep6mA: A deep learning framework for exploring similar patterns in DNA N6-methyladenine sites across different species.
PLoS Comput Biol. 2021 Feb 18;17(2):e1008767. doi: 10.1371/journal.pcbi.1008767. eCollection 2021 Feb.
10
DeepTorrent: a deep learning-based approach for predicting DNA N4-methylcytosine sites.
Brief Bioinform. 2021 May 20;22(3). doi: 10.1093/bib/bbaa124.

Cited by

1
Artificial neural network-driven approaches to improved forecasting of disability care expenditures in an aging Kingdom of Saudi Arabia population.
Sci Rep. 2025 Jul 1;15(1):20538. doi: 10.1038/s41598-025-05364-8.
2
Reducing M2 macrophage in lung fibrosis by controlling anti-M1 agent.
Sci Rep. 2025 Feb 3;15(1):4120. doi: 10.1038/s41598-024-76561-0.
3
An efficient smart phone application for wheat crop diseases detection using advanced machine learning.
PLoS One. 2025 Jan 8;20(1):e0312768. doi: 10.1371/journal.pone.0312768. eCollection 2025.
4
Strategic scheduling of the electric vehicle-based microgrids under the enhanced particle swarm optimization algorithm.
Sci Rep. 2024 Dec 28;14(1):30795. doi: 10.1038/s41598-024-81049-y.
5
Analyzing scRNA-seq data by CCP-assisted UMAP and tSNE.
PLoS One. 2024 Dec 13;19(12):e0311791. doi: 10.1371/journal.pone.0311791. eCollection 2024.
6
CAD-PsorNet: deep transfer learning for computer-assisted diagnosis of skin psoriasis.
Sci Rep. 2024 Nov 4;14(1):26557. doi: 10.1038/s41598-024-76852-6.

References

1
DRSN4mCPred: accurately predicting sites of DNA N4-methylcytosine using deep residual shrinkage network for diagnosis and treatment of gastrointestinal cancer in the precision medicine era.
Front Med (Lausanne). 2023 May 4;10:1187430. doi: 10.3389/fmed.2023.1187430. eCollection 2023.
2
EpiTEAmDNA: Sequence feature representation via transfer learning and ensemble learning for identifying multiple DNA epigenetic modification types across species.
Comput Biol Med. 2023 Jun;160:107030. doi: 10.1016/j.compbiomed.2023.107030. Epub 2023 May 11.
3
DeepBIO: an automated and interpretable deep-learning platform for high-throughput biological sequence prediction, functional annotation and visualization analysis.
Nucleic Acids Res. 2023 Apr 24;51(7):3017-3029. doi: 10.1093/nar/gkad055.
4
iDNA-ABF: multi-scale deep biological language learning model for the interpretable prediction of DNA methylations.
Genome Biol. 2022 Oct 17;23(1):219. doi: 10.1186/s13059-022-02780-1.
5
i6mA-Caps: a CapsuleNet-based framework for identifying DNA N6-methyladenine sites.
Bioinformatics. 2022 Aug 10;38(16):3885-3891. doi: 10.1093/bioinformatics/btac434.
6
Hyb4mC: a hybrid DNA2vec-based model for DNA N4-methylcytosine sites prediction.
BMC Bioinformatics. 2022 Jun 29;23(1):258. doi: 10.1186/s12859-022-04789-6.
7
scIMC: a platform for benchmarking comparison and visualization analysis of scRNA-seq data imputation methods.
Nucleic Acids Res. 2022 May 20;50(9):4877-4899. doi: 10.1093/nar/gkac317.
8
BERT6mA: prediction of DNA N6-methyladenine site using deep learning-based approaches.
Brief Bioinform. 2022 Mar 10;23(2). doi: 10.1093/bib/bbac053.
9
BiLSTM-5mC: A Bidirectional Long Short-Term Memory-Based Approach for Predicting 5-Methylcytosine Sites in Genome-Wide DNA Promoters.
Molecules. 2021 Dec 7;26(24):7414. doi: 10.3390/molecules26247414.
10
iDNA-ABT: advanced deep learning model for detecting DNA methylation with adaptive features and transductive information maximization.
Bioinformatics. 2021 Dec 11;37(24):4603-4610. doi: 10.1093/bioinformatics/btab677.