Analysis of bacterial RM-systems through genome-scale analysis and related taxonomy issues.

Authors

Vandenbogaert Mathias, Makeev Vsevolod

Affiliation

INRIA Rocquencourt - LaBRI Bordeaux I, Domaine de Voluceau, Le Chesnay 78153, France.

Publication

In Silico Biol. 2003;3(1-2):127-43. Epub 2003 Mar 16.

Abstract

Recognition sites for type II restriction and modification enzymes are recognized as semi-palindromic motifs in the genomes of several bacteria and are avoided to a significant degree. The key idea of contrast word analysis, applied to the recognition sites of restriction-modification systems (RMS), is that under-represented words are likely to be selected against. Starting from over- or under-represented words that correspond to RMS recognition sites in specific clades, the specificity of unknown R-M systems can be highlighted. Many of the known restriction enzymes described in REBASE, the database of restriction and modification systems, still have uncharacterized recognition sites. This motivates studies that assess horizontal transfer events of RMS in micro-organisms through the analysis of word usage biases in well-determined genomic regions. A probabilistic model is built on a first-order Markov chain. Statistics on the k-neighborhood of a word are computed to assess the biological significance of a genomic motif. Efficient word-counting procedures have been implemented, and statistics are used to assess the significance of individual words in large sequences. On the basis of the set of most avoided words, and in accordance with the IUPAC coding standards, suggestions are made regarding potential recognition sequences. In certain cases, a comparison of avoided palindromic words in taxonomically related bacteria shows a pattern of relatedness among their R-M systems. To strengthen this analysis, the primary protein structures of all type II R-M systems known in REBASE were searched with BLAST against the nr-GENBANK database. The combination of these analyses has revealed some interesting examples of possible horizontal transfer events of R-M systems.
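
The abstract does not give the formulas, so the following is only a minimal sketch of the general idea, not the authors' implementation. It assumes the commonly used first-order Markov estimator for the expected count of a word w1...wk, namely the product of its dinucleotide counts N(w_i w_{i+1}) divided by the product of its interior mononucleotide counts N(w_i), and it ranks reverse-complement palindromes by their observed/expected ratio. All function names are hypothetical, and the plain ratio stands in for the significance statistics and k-neighborhood analysis mentioned in the abstract.

```python
from collections import Counter
from itertools import product

# Translation table for complementing unambiguous DNA bases.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(word):
    """Return the reverse complement of a DNA word."""
    return word.translate(COMPLEMENT)[::-1]


def is_palindrome(word):
    """A DNA word is palindromic if it equals its reverse complement."""
    return word == reverse_complement(word)


def count_words(seq, k):
    """Count all overlapping words of length k in seq."""
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))


def expected_markov1(word, mono, di):
    """Expected count of word under a first-order Markov model (assumed estimator).

    E(w) = N(w1w2) * N(w2w3) * ... / (N(w2) * N(w3) * ...),
    where mono and di hold the mono- and dinucleotide counts of the sequence.
    """
    expected = float(di[word[:2]])
    for i in range(1, len(word) - 1):
        if mono[word[i]] == 0:
            return 0.0
        expected *= di[word[i:i + 2]] / mono[word[i]]
    return expected


def avoided_palindromes(seq, k):
    """List palindromic k-words as (word, observed, expected, ratio), most avoided first."""
    mono = count_words(seq, 1)
    di = count_words(seq, 2)
    observed = count_words(seq, k)
    rows = []
    for letters in product("ACGT", repeat=k):
        word = "".join(letters)
        if not is_palindrome(word):
            continue
        expected = expected_markov1(word, mono, di)
        if expected > 0:  # skip words the model cannot produce at all
            rows.append((word, observed[word], expected, observed[word] / expected))
    return sorted(rows, key=lambda row: row[3])


if __name__ == "__main__":
    # Toy sequence for illustration only; real input would be a whole genome.
    toy = "ACGTACGTTTAACCGGAATTCGATCGTACGGCCTAGCTAGGATCC"
    for word, obs, exp, ratio in avoided_palindromes(toy, 4)[:5]:
        print(word, obs, round(exp, 2), round(ratio, 2))
```

On a real genome, words with a ratio well below 1 would be candidates for avoided recognition sites. A ratio alone says nothing about statistical significance, which is why the abstract's approach relies on dedicated word statistics.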
