Detection of prokaryotic promoters from the genomic distribution of hexanucleotide pairs.

Author information

Jacques Pierre-Etienne, Rodrigue Sébastien, Gaudreau Luc, Goulet Jean, Brzezinski Ryszard

Affiliation

Département de biologie, Université de Sherbrooke, Sherbrooke, Québec, Canada.

Publication information

BMC Bioinformatics. 2006 Oct 2;7:423. doi: 10.1186/1471-2105-7-423.

Abstract

BACKGROUND

In bacteria, sigma factors and other transcriptional regulatory proteins recognize DNA patterns upstream of their target genes and interact with RNA polymerase to control transcription. As a consequence of evolution, DNA sequences recognized by transcription factors are thought to be enriched in intergenic regions (IRs) and depleted from coding regions of prokaryotic genomes.

RESULTS

In this work, we report that the genomic distribution of transcription factor binding sites is biased towards IRs, and that this bias is conserved among bacterial species. We further take advantage of this observation to develop an algorithm that can efficiently identify promoter boxes by a distribution-dependent approach rather than a direct sequence comparison approach. This strategy, which can easily be combined with other methodologies, allowed the identification of promoter sequences in ten species and can be used with any annotated bacterial genome, with results that rival those of current methodologies. Experimental validation of predicted promoters also supports our approach.
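The distribution-dependent idea can be illustrated with a minimal sketch. It is not the authors' published algorithm: it only counts hexanucleotide pairs separated by a spacer and compares their frequency in intergenic sequences with their frequency in coding sequences. The function names (`count_pairs`, `ir_enrichment`), the 15 to 19 bp spacer range, the pseudocount, and the simple frequency ratio are all assumptions made for illustration.

```python
from collections import Counter


def count_pairs(seqs, k=6, spacers=range(15, 20)):
    """Count (left k-mer, spacer length, right k-mer) occurrences.

    Returns the counts and the total number of windows scanned.
    The spacer range is an illustrative assumption, not a value
    taken from the abstract.
    """
    counts = Counter()
    total = 0
    for seq in seqs:
        seq = seq.upper()
        for gap in spacers:
            span = 2 * k + gap
            for i in range(len(seq) - span + 1):
                left = seq[i:i + k]
                right = seq[i + k + gap:i + span]
                counts[(left, gap, right)] += 1
                total += 1
    return counts, total


def ir_enrichment(intergenic, coding, k=6, spacers=range(15, 20), pseudo=1.0):
    """Score each pair seen in intergenic regions by its IR/coding frequency ratio.

    A pseudocount (hypothetical choice) avoids division by zero for
    pairs that never occur in coding sequences.
    """
    ir_counts, ir_total = count_pairs(intergenic, k, spacers)
    cds_counts, cds_total = count_pairs(coding, k, spacers)
    scores = {}
    for pair, n_ir in ir_counts.items():
        f_ir = (n_ir + pseudo) / (ir_total + pseudo)
        f_cds = (cds_counts.get(pair, 0) + pseudo) / (cds_total + pseudo)
        scores[pair] = f_ir / f_cds
    return scores


if __name__ == "__main__":
    # Toy sequences, invented for the example only.
    intergenic = ["TTGACA" + "A" * 17 + "TATAAT"] * 3
    coding = ["GCGCGC" * 10]
    scores = ir_enrichment(intergenic, coding, spacers=range(17, 18))
    for pair, score in sorted(scores.items(), key=lambda kv: -kv[1]):
        print(pair, round(score, 2))
```

Pairs with a ratio well above 1 are over-represented in intergenic regions and would be candidate promoter boxes under this simplified scoring; a real analysis would need annotated genome coordinates and a statistical test rather than a raw ratio.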

CONCLUSION

Considering that complete genomic sequences of over 1000 bacteria will soon be available and that little transcriptional information is available for most of them, our algorithm constitutes a promising tool for the prediction of promoter sequences. Importantly, our methodology could also be adapted to identify DNA sequences recognized by other regulatory proteins.

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/68b2/1615881/4571dd7e7811/1471-2105-7-423-1.jpg
