A novel sequence and context based method for promoter recognition.

Authors

P Umesh, Dubey Jitendra Kumar, Rv Karthika, Cherian Betsy Sheena, Gopalakrishnan Gopakumar, Nair Achuthsankar Sukumaran

Affiliations

Department of Computational Biology and Bioinformatics, University of Kerala, Thiruvananthapuram - 695581, Kerala, India.

Department of Computer Science and Engineering, National Institute of Technology, Calicut - 673601, Kerala, India.

Publication information

Bioinformation. 2014 Apr 23;10(4):175-9. doi: 10.6026/97320630010175. eCollection 2014.

DOI:10.6026/97320630010175
PMID:24966516
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC4070045/
Abstract

Computational identification of promoters in DNA sequences is a significant research area because of its direct association with transcription regulation. A wide range of algorithms is available for promoter prediction, but most are polymerase dependent and cannot handle eukaryotes and prokaryotes alike. This study proposes a polymerase-independent algorithm that predicts whether a given DNA fragment is a promoter, based on sequence features and statistical elements. The algorithm considers all possible pentamers formed from the nucleotides A, C, G, and T (4^5 = 1024 in total), along with CpG islands, the TATA box, initiator elements, and downstream promoter elements. Its highlight is that it is not polymerase specific and makes predictions for both eukaryotes and prokaryotes in the same computational manner, even though the underlying biological mechanisms of promoter recognition differ greatly. The proposed method, Promoter Prediction System (PPS-CBM), achieved sensitivity, specificity, and accuracy of 75.08%, 83.58%, and 79.33% on an E. coli data set and 86.67%, 88.41%, and 87.58% on a human data set. We have developed a tool based on PPS-CBM with which multiple sequences of varying lengths can be tested simultaneously; results are reported in a comprehensive tabular format, and the tool also reports the strength of each prediction.

AVAILABILITY

The tool and source code of PPS-CBM are available at http://keralabs.org.
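
To make the feature description in the abstract concrete, the following is a minimal sketch of how pentamer frequencies and a few context signals could be computed from a DNA fragment. It is not the PPS-CBM implementation: the function names `pentamer_frequencies` and `context_features`, the simplified TATA-like pattern `TATA[AT]A`, and the use of the observed/expected CpG ratio as a stand-in for CpG island detection are all assumptions made for illustration. Initiator and downstream promoter elements are omitted.

```python
from itertools import product
import re

NUCLEOTIDES = "ACGT"
# All 4**5 = 1024 possible pentamers over A, C, G, T, in a fixed order.
PENTAMERS = ["".join(p) for p in product(NUCLEOTIDES, repeat=5)]
PENTAMER_INDEX = {p: i for i, p in enumerate(PENTAMERS)}


def pentamer_frequencies(seq):
    """Relative frequency of each pentamer, counted over overlapping windows."""
    seq = seq.upper()
    counts = [0] * len(PENTAMERS)
    total = 0
    for i in range(len(seq) - 4):
        idx = PENTAMER_INDEX.get(seq[i:i + 5])
        if idx is not None:  # skip windows containing N or other symbols
            counts[idx] += 1
            total += 1
    if total == 0:
        return [0.0] * len(PENTAMERS)
    return [c / total for c in counts]


def context_features(seq):
    """Illustrative context signals (assumed, not taken from the paper)."""
    seq = seq.upper()
    n = len(seq)
    c, g = seq.count("C"), seq.count("G")
    gc_content = (c + g) / n if n else 0.0
    # Observed/expected CpG ratio: (#CG * length) / (#C * #G).
    cpg_obs_exp = seq.count("CG") * n / (c * g) if c and g else 0.0
    return {
        "gc_content": gc_content,
        "cpg_obs_exp": cpg_obs_exp,
        # Simplified TATA-like consensus; a hypothetical pattern for this sketch.
        "has_tata_like": bool(re.search(r"TATA[AT]A", seq)),
    }


if __name__ == "__main__":
    fragment = "GGCGCGTATAAAGGCTAGCTAGGCGC"
    freqs = pentamer_frequencies(fragment)
    print(len(freqs), context_features(fragment))
```

The 1024 frequencies and the context signals could then be concatenated into a single feature vector for whatever classifier is used; the abstract does not say which classifier PPS-CBM uses, so none is assumed here.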

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/53f3/4070045/756ee6c7e5b3/97320630010175F1.jpg

Similar articles

1
A novel sequence and context based method for promoter recognition.
Bioinformation. 2014 Apr 23;10(4):175-9. doi: 10.6026/97320630010175. eCollection 2014.
2
Weight matrix descriptions of four eukaryotic RNA polymerase II promoter elements derived from 502 unrelated promoter sequences.
J Mol Biol. 1990 Apr 20;212(4):563-78. doi: 10.1016/0022-2836(90)90223-9.
3
Characterization of transcription from TATA-less promoters: identification of a new core promoter element XCPE2 and analysis of factor requirements.
PLoS One. 2009;4(4):e5103. doi: 10.1371/journal.pone.0005103. Epub 2009 Apr 1.
4
G4PromFinder: an algorithm for predicting transcription promoters in GC-rich bacterial genomes based on AT-rich elements and G-quadruplex motifs.
BMC Bioinformatics. 2018 Feb 6;19(1):36. doi: 10.1186/s12859-018-2049-x.
5
Computational identification and experimental characterization of preferred downstream positions in human core promoters.
PLoS Comput Biol. 2021 Aug 12;17(8):e1009256. doi: 10.1371/journal.pcbi.1009256. eCollection 2021 Aug.
6
GPMiner: an integrated system for mining combinatorial cis-regulatory elements in mammalian gene group.
BMC Genomics. 2012;13 Suppl 1(Suppl 1):S3. doi: 10.1186/1471-2164-13-S1-S3. Epub 2012 Jan 17.
7
ElemeNT: a computational tool for detecting core promoter elements.
Transcription. 2015;6(3):41-50. doi: 10.1080/21541264.2015.1067286.
8
The features of Drosophila core promoters revealed by statistical analysis.
BMC Genomics. 2006 Jun 21;7:161. doi: 10.1186/1471-2164-7-161.
9
Transcriptional activation by simian virus 40 large T antigen: requirements for simple promoter structures containing either TATA or initiator elements with variable upstream factor binding sites.
J Virol. 1993 Nov;67(11):6682-8. doi: 10.1128/JVI.67.11.6682-6688.1993.
10
DNA sequence and structural properties as predictors of human and mouse promoters.
Gene. 2008 Feb 29;410(1):165-76. doi: 10.1016/j.gene.2007.12.011. Epub 2007 Dec 23.
