Promoter prediction in Nannochloropsis based on densely connected convolutional neural networks.

Affiliations

Institutes of Physical Science and Information Technology, Anhui University, Hefei, China.

Anhui Provincial Key Laboratory of Multimodal Cognitive Computation, School of Computer Science and Technology, Anhui University, Hefei, China.

Publication information

Methods. 2022 Aug;204:38-46. doi: 10.1016/j.ymeth.2022.03.017. Epub 2022 Mar 31.

DOI:10.1016/j.ymeth.2022.03.017
PMID:35367367
Abstract

A promoter is a key DNA element located near the transcription start site that regulates gene transcription by binding RNA polymerase. The identification of promoters is therefore an important research area in synthetic biology. Nannochloropsis is an important unicellular, industrial oleaginous microalga. Some studies have identified promoters with specific functions in Nannochloropsis by biological methods, but few have used computational methods. Here, we propose a method called DNPPro (DenseNet-Predict-Promoter), based on densely connected convolutional neural networks, to predict promoters in Nannochloropsis. First, we collected promoter sequences from six Nannochloropsis strains and, for each strain, used CD-HIT to remove redundant sequences at an 80% similarity threshold, yielding a reliable positive dataset. Then, to construct a robust classifier, we used a within-group scrambling method to generate the negative dataset, which overcomes the limitation of randomly selecting non-promoter regions from the same genome as negative samples. Finally, we constructed a densely connected convolutional neural network that takes the one-hot encoding of the sequence as input. Compared with commonly used sequence processing methods, DNPPro can extract long sequence features to a greater extent. A cross-strain experiment on an independent dataset verifies the generalization of our method, and t-SNE visualization shows that our method can effectively distinguish promoters from non-promoters.
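
The two data-preparation ideas named in the abstract, one-hot encoding and scrambling-based negative samples, can be illustrated with a minimal Python sketch. The abstract does not define "within-group scrambling", so the version below is only one plausible reading and an assumption: each promoter is cut into consecutive fixed-size groups and the bases inside each group are shuffled, which preserves local base composition while breaking up motifs. The function names, the `group_size` parameter, and the example sequence are hypothetical and are not taken from DNPPro.

```python
import random

ALPHABET = "ACGT"  # column order of the one-hot rows


def one_hot(seq):
    """Encode a DNA string as a list of 4-element rows (A, C, G, T).

    Bases outside ACGT (for example N) become all-zero rows; this
    handling is an assumption, not something the abstract specifies.
    """
    return [[1 if base == letter else 0 for letter in ALPHABET]
            for base in seq.upper()]


def scramble_within_groups(seq, group_size=10, rng=None):
    """Hypothetical negative-sample generator.

    Cuts `seq` into consecutive groups of `group_size` bases and
    shuffles the bases inside each group, so every group keeps its
    base composition but loses its original order.
    """
    rng = rng or random.Random()
    pieces = []
    for start in range(0, len(seq), group_size):
        group = list(seq[start:start + group_size])
        rng.shuffle(group)
        pieces.append("".join(group))
    return "".join(pieces)


if __name__ == "__main__":
    promoter = "TATAAAGGCGCGATCGATTACGCTAGCTAGGA"  # made-up sequence
    negative = scramble_within_groups(promoter, group_size=8,
                                      rng=random.Random(0))
    print(negative)                # scrambled counterpart of the promoter
    print(one_hot(promoter[:4]))   # one row per base
```

Passing a seeded `random.Random` keeps the generated negatives reproducible. The resulting rows could be stacked into a length-by-4 matrix as input to a convolutional network.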


Similar articles

1. Promoter prediction in Nannochloropsis based on densely connected convolutional neural networks.
   Methods. 2022 Aug;204:38-46. doi: 10.1016/j.ymeth.2022.03.017. Epub 2022 Mar 31.
2. GraphPro: An interpretable graph neural network-based model for identifying promoters in multiple species.
   Comput Biol Med. 2024 Sep;180:108974. doi: 10.1016/j.compbiomed.2024.108974. Epub 2024 Aug 2.
3. Computational identification of eukaryotic promoters based on cascaded deep capsule neural networks.
   Brief Bioinform. 2021 Jul 20;22(4). doi: 10.1093/bib/bbaa299.
4. Integrating distal and proximal information to predict gene expression via a densely connected convolutional neural network.
   Bioinformatics. 2020 Jan 15;36(2):496-503. doi: 10.1093/bioinformatics/btz562.
5. iProm-Zea: A two-layer model to identify plant promoters and their types using convolutional neural network.
   Genomics. 2022 May;114(3):110384. doi: 10.1016/j.ygeno.2022.110384. Epub 2022 May 6.
6. A successful hybrid deep learning model aiming at promoter identification.
   BMC Bioinformatics. 2022 May 31;23(Suppl 1):206. doi: 10.1186/s12859-022-04735-6.
7. iPromoter-CLA: Identifying promoters and their strength by deep capsule networks with bidirectional long short-term memory.
   Comput Methods Programs Biomed. 2022 Nov;226:107087. doi: 10.1016/j.cmpb.2022.107087. Epub 2022 Aug 28.
8. pcPromoter-CNN: A CNN-Based Prediction and Classification of Promoters.
   Genes (Basel). 2020 Dec 21;11(12):1529. doi: 10.3390/genes11121529.
9. iPro2L-DG: Hybrid network based on improved densenet and global attention mechanism for identifying promoter sequences.
   Heliyon. 2024 Mar 6;10(6):e27364. doi: 10.1016/j.heliyon.2024.e27364. eCollection 2024 Mar 30.
10. iProL: identifying DNA promoters from sequence information based on Longformer pre-trained model.
    BMC Bioinformatics. 2024 Jun 25;25(1):224. doi: 10.1186/s12859-024-05849-9.

Cited by

1. iKcr-DRC: prediction of lysine crotonylation sites in proteins based on a novel attention module and DenseNet.
   Front Genet. 2025 Jun 11;16:1574832. doi: 10.3389/fgene.2025.1574832. eCollection 2025.
2. Enhanced Eicosapentaenoic Acid Production via Synthetic Biological Strategy in .
   Mar Drugs. 2024 Dec 19;22(12):570. doi: 10.3390/md22120570.
3. Recognition of cyanobacteria promoters via Siamese network-based contrastive learning under novel non-promoter generation.
   Brief Bioinform. 2024 Mar 27;25(3). doi: 10.1093/bib/bbae193.
4. iPro2L-DG: Hybrid network based on improved densenet and global attention mechanism for identifying promoter sequences.
   Heliyon. 2024 Mar 6;10(6):e27364. doi: 10.1016/j.heliyon.2024.e27364. eCollection 2024 Mar 30.
5. i5mC-DCGA: an improved hybrid network framework based on the CBAM attention mechanism for identifying promoter 5mC sites.
   BMC Genomics. 2024 Mar 5;25(1):242. doi: 10.1186/s12864-024-10154-z.
6. CircPCBL: Identification of Plant CircRNAs with a CNN-BiGRU-GLT Model.
   Plants (Basel). 2023 Apr 14;12(8):1652. doi: 10.3390/plants12081652.