A successful hybrid deep learning model aiming at promoter identification.

Affiliation

Systems Engineering Institute, Xi'an Jiaotong University, Xi'an, China.

Publication information

BMC Bioinformatics. 2022 May 31;23(Suppl 1):206. doi: 10.1186/s12859-022-04735-6.

DOI:10.1186/s12859-022-04735-6
PMID:35641900
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC9158169/
Abstract

BACKGROUND

The promoter, the region adjacent to a transcription start site (TSS), is primarily involved in the initiation and regulation of DNA transcription. Accurate promoter identification is therefore critical for understanding the networks that control genomic regulation. Many methods for identifying promoters have been proposed, but because promoters are highly heterogeneous, their results remain unsatisfactory. To extract more discriminative features and recognize promoters accurately, we developed the hybrid model for promoter identification (HMPI), a hybrid deep learning model that characterizes both the native sequences of promoters and their structural profiles at the same time. The HMPI combines the PSFN (promoter sequence features network), which characterizes native promoter sequences and derives sequence features, with the DSPN (deep structural profiles network), which is specially structured to model promoters in terms of their structural profiles and to derive their structural attributes.
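
As a rough illustration of the two views of a promoter that the HMPI combines, the sketch below builds a one-hot encoding of a sequence (the kind of input a sequence branch such as the PSFN might consume) and a per-dinucleotide numeric profile (the kind of input a structural branch such as the DSPN might consume). The abstract does not specify either encoding, so the function names, the one-hot scheme and the `TOY_PROPERTY` table (GC fraction per dinucleotide, standing in for a real structural property table) are all assumptions made for illustration.

```python
from itertools import product

BASES = "ACGT"


def one_hot(seq):
    """Encode a DNA string as a list of 4-element 0/1 vectors (A, C, G, T)."""
    index = {base: i for i, base in enumerate(BASES)}
    encoded = []
    for base in seq.upper():
        row = [0, 0, 0, 0]
        if base in index:  # unknown symbols such as N stay all-zero
            row[index[base]] = 1
        encoded.append(row)
    return encoded


# Hypothetical stand-in for a structural property table:
# the GC fraction of each of the 16 dinucleotides (0.0, 0.5 or 1.0).
TOY_PROPERTY = {
    a + b: sum(x in "GC" for x in (a, b)) / 2
    for a, b in product(BASES, repeat=2)
}


def structural_profile(seq, table=TOY_PROPERTY):
    """Map every overlapping dinucleotide to its property value."""
    seq = seq.upper()
    return [table.get(seq[i:i + 2], 0.0) for i in range(len(seq) - 1)]


seq = "TATAAT"
sequence_input = one_hot(seq)             # 6 positions x 4 channels
profile_input = structural_profile(seq)   # 5 overlapping dinucleotides
print(len(sequence_input), len(profile_input))
```

In a hybrid model of this kind, each representation would feed its own sub-network and the resulting feature vectors would be merged before classification; how the HMPI performs that merge is not stated in the abstract.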

RESULTS

The HMPI was applied to human, plant and Escherichia coli K-12 strain datasets, and the findings showed that the HMPI was successful at extracting the features of the promoter while greatly enhancing the promoter identification performance. In addition, after the improvements of synthetic sampling, transfer learning and label smoothing regularization, the improved HMPI models achieved good results in identifying subtypes of promoters on prokaryotic promoter datasets.
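
Of the three improvements named above, label smoothing regularization is the simplest to show in isolation. The sketch below uses the standard formulation, in which a one-hot target over K classes is mixed with a uniform distribution using a weight epsilon. The value epsilon = 0.1, the four-class setup and the function names are assumptions for illustration; the abstract does not report the settings used for the improved HMPI models.

```python
import math


def smooth_labels(one_hot_target, epsilon=0.1):
    """Mix a one-hot target with a uniform distribution over its K classes."""
    k = len(one_hot_target)
    return [(1.0 - epsilon) * y + epsilon / k for y in one_hot_target]


def cross_entropy(target, probs):
    """Cross-entropy between a target distribution and predicted probabilities."""
    return -sum(t * math.log(p) for t, p in zip(target, probs) if t > 0)


# Hypothetical 4-class promoter-subtype example; the true class is index 1.
hard_target = [0, 1, 0, 0]
soft_target = smooth_labels(hard_target, epsilon=0.1)

# A very confident prediction is penalized more under the smoothed target,
# which discourages the model from becoming over-confident.
confident_probs = [0.01, 0.97, 0.01, 0.01]
hard_loss = cross_entropy(hard_target, confident_probs)
soft_loss = cross_entropy(soft_target, confident_probs)
print(soft_target, hard_loss < soft_loss)
```

Smoothing of this kind is often used when some classes have few examples, which is consistent with its pairing here with synthetic sampling and transfer learning for subtype identification.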

CONCLUSIONS

The results showed that the HMPI was successful at extracting the features of promoters while greatly enhancing the performance of identifying promoters on both eukaryotic and prokaryotic datasets, and the improved HMPI models are good at identifying subtypes of promoters on prokaryotic promoter datasets. The HMPI is additionally adaptable to different biological functional sequences, allowing for the addition of new features or models.

Fig. 1: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/15bd/9158169/5a1c24374113/12859_2022_4735_Fig1_HTML.jpg
Fig. 2: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/15bd/9158169/10e434e169ce/12859_2022_4735_Fig2_HTML.jpg
Fig. 3: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/15bd/9158169/1859f69733f9/12859_2022_4735_Fig3_HTML.jpg
Fig. 4: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/15bd/9158169/18b8af870b30/12859_2022_4735_Fig4_HTML.jpg
Fig. 5: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/15bd/9158169/bc3faf65d02c/12859_2022_4735_Fig5_HTML.jpg
