Prediction of transcriptional regulatory sites in the complete genome sequence of Escherichia coli K-12.

Authors

Thieffry D, Salgado H, Huerta AM, Collado-Vides J

Affiliation

Centro de Investigación sobre Fijación de Nitrógeno, Universidad Nacional Autónoma de México, AP 565-A, Cuernavaca, Morelos 62100, México.

Publication

Bioinformatics. 1998 Jun;14(5):391-400. doi: 10.1093/bioinformatics/14.5.391.

Abstract

MOTIVATION

As one of the best-characterized free-living organisms, Escherichia coli and its recently completed genomic sequence offer a special opportunity to exploit systematically the variety of regulatory data available in the literature in order to make a comprehensive set of regulatory predictions in the whole genome.

RESULTS

The complete genome sequence of E. coli was analyzed for binding sites of transcriptional regulators upstream of coding sequences. The biological information contained in RegulonDB (Huerta, A.M. et al., Nucleic Acids Res., 26, 55-60, 1998) for 56 different transcriptional proteins served as the basis for a stringent strategy combining string search and weight matrices. We estimate that our search included representatives of 15-25% of the total number of regulatory binding proteins in E. coli. The search was performed on the set of 4288 putative regulatory regions, each 450 bp long. Among the regions with predicted sites, 89% are regulated by one protein and 81% involve only one site. These numbers are reasonably consistent with the distribution of experimental regulatory sites. Regulatory sites are found in 603 regions, corresponding to 16% of operon regions and 10% of intra-operonic regions. Additional evidence gives stronger support to some of these predictions, including the position of the site, biological consistency with the function of the downstream gene, and genetic evidence for the regulatory interaction. The predictions described here were incorporated into the map presented in the paper describing the complete E. coli genome (Blattner, F.R. et al., Science, 277, 1453-1461, 1997).
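As an illustration of the weight-matrix half of such a strategy, the following minimal Python sketch builds a log-odds matrix from a few aligned sites and scans one upstream region for windows scoring above a cutoff. It is not the authors' implementation: the function names, the toy sites, the uniform background, the pseudocount, and the threshold are all assumptions chosen for the example, and the abstract does not state how string search and matrix scores were combined.

```python
import math

BASES = "ACGT"

def build_log_odds_matrix(sites, pseudocount=1.0, background=0.25):
    """Build a log2-odds weight matrix from equal-length aligned sites.

    The pseudocount and uniform background are illustrative assumptions.
    """
    length = len(sites[0])
    matrix = []
    for pos in range(length):
        counts = {base: pseudocount for base in BASES}
        for site in sites:
            counts[site[pos]] += 1
        total = sum(counts.values())
        matrix.append(
            {base: math.log2((counts[base] / total) / background) for base in BASES}
        )
    return matrix

def scan_region(region, matrix, threshold):
    """Return (offset, window, score) for every window scoring >= threshold."""
    width = len(matrix)
    hits = []
    for start in range(len(region) - width + 1):
        window = region[start:start + width]
        if any(base not in BASES for base in window):
            continue  # skip windows containing ambiguous bases
        score = sum(matrix[i][base] for i, base in enumerate(window))
        if score >= threshold:
            hits.append((start, window, score))
    return hits

# Hypothetical toy data: four made-up aligned sites and a short made-up
# region (the study itself scanned 450 bp regions).
known_sites = ["TGTGA", "TGTGA", "TGTCA", "TGAGA"]
pwm = build_log_odds_matrix(known_sites)
upstream = "AACCTGTGATTTGGCC"
hits = scan_region(upstream, pwm, threshold=5.0)

for offset, window, score in hits:
    print(offset, window, round(score, 2))
```

In a full pipeline, a scan like this would be run once per regulatory protein over each putative regulatory region, with a cutoff chosen per matrix to keep the search stringent.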

AVAILABILITY

The complete set of predictions in GenBank format is available at the URL: http://www.cifn.unam.mx/Computational_Biology/E.coli-predictions

CONTACT

ecoli-reg@cifn.unam.mx, collado@cifn.unam.mx

