Predicting Promoters in Multiple Prokaryotes with Prompt.

Affiliations

School of Engineering, Air-Space-Ground Integrated Intelligence and Big Data Application Engineering Research Center of Yunnan Provincial Department of Education, Dali University, Dali, 671003, China.

College of Biotechnology, Tianjin University of Science & Technology, Tianjin, 300457, China.

Publication information

Interdiscip Sci. 2024 Dec;16(4):814-828. doi: 10.1007/s12539-024-00637-8. Epub 2024 Aug 7.

DOI:10.1007/s12539-024-00637-8
PMID:39110340
Abstract

Promoters are important cis-regulatory elements that regulate gene expression, and predicting them accurately is crucial for elucidating the biological functions and potential mechanisms of genes. Many earlier prokaryotic promoter prediction methods achieve encouraging performance, but most of them recognize promoters in only one or a few bacterial species. Moreover, because existing methods ignore promoter sequence motifs, the interpretability of their predictions is limited. In this work, we present a generalized method, Prompt (Promoters in multiple prokaryotes), to predict promoters in 16 prokaryotes and to improve the interpretability of the prediction results. Prompt integrates three methods, RSK (Regression based on Selected k-mer), CL (Contrastive Learning) and MLP (Multilayer Perceptron), and employs a voting strategy to divide the datasets into high-confidence and low-confidence categories. Results on the promoter prediction tasks show that, on the high-confidence datasets, the performance of Prompt (accuracy and Matthews correlation coefficient) is greater than 80% in all 16 prokaryotes and greater than 90% in 12 of them, and that Prompt performs best compared with other existing methods. Moreover, by identifying promoter sequence motifs, Prompt can improve the interpretability of the predictions. Prompt is freely available at https://github.com/duqimeng/Prompt and will contribute to research on promoters in prokaryotes.
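
The abstract does not say how the votes of the three models are combined, so the following Python sketch is only one plausible reading. It assumes each model emits a binary label (1 = promoter, 0 = non-promoter), takes the majority label, and calls a prediction high-confidence only when all models agree. That unanimity rule, the function names `vote` and `accuracy_and_mcc`, and the toy labels are all hypothetical and are not taken from the Prompt code. The second function computes the two reported metrics, accuracy and the Matthews correlation coefficient, from their standard definitions.

```python
from math import sqrt


def vote(predictions):
    """Combine binary votes (0/1) from an odd number of models.

    Returns (majority_label, confidence). ASSUMPTION: "high" means the
    models are unanimous; the paper's actual rule may differ.
    """
    positives = sum(predictions)
    label = 1 if positives * 2 > len(predictions) else 0
    unanimous = positives in (0, len(predictions))
    return label, "high" if unanimous else "low"


def accuracy_and_mcc(y_true, y_pred):
    """Accuracy and Matthews correlation coefficient for binary labels."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    acc = (tp + tn) / len(pairs)
    denom = sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # By convention MCC is reported as 0 when the denominator vanishes.
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return acc, mcc


if __name__ == "__main__":
    # Hypothetical per-sequence labels from the three models (toy data).
    rsk = [1, 1, 0, 0, 1, 0]
    cl = [1, 0, 0, 0, 1, 1]
    mlp = [1, 1, 0, 1, 1, 0]
    truth = [1, 1, 0, 0, 1, 0]

    results = [vote(v) for v in zip(rsk, cl, mlp)]
    # Keep only the sequences on which all three models agree.
    high = [(t, label) for t, (label, conf) in zip(truth, results) if conf == "high"]
    acc, mcc = accuracy_and_mcc([t for t, _ in high], [p for _, p in high])
    print(f"high-confidence sequences: {len(high)}, accuracy={acc:.2f}, MCC={mcc:.2f}")
```

Under this reading, the metrics would be reported separately on the unanimous subset and on the remainder, which is one way a method could score higher on a "high-confidence" dataset than on the full data.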
