N. I. Vavilov Institute of General Genetics, Russian Academy of Sciences, ul. Gubkina 3, Moscow 119991, Russia.
Appl Environ Microbiol. 2010 Apr;76(7):2136-44. doi: 10.1128/AEM.01985-09. Epub 2010 Jan 29.
Clustered regularly interspaced short palindromic repeats (CRISPRs) form a recently characterized type of prokaryotic antiphage defense system. Phage-host interactions involving CRISPRs have been studied experimentally in selected bacterial or archaeal species and computationally in completely sequenced genomes. However, these studies cannot take prokaryotic population diversity and the dynamics of phage-host interactions into account. This gap can be filled by using metagenomic data, in particular the largest existing data set, generated by the Sorcerer II Global Ocean Sampling expedition. Applying three publicly available CRISPR recognition programs to the Global Ocean metagenome produced a large proportion of false-positive results. To address this problem, a filtering procedure was designed. It yielded about 200 reliable CRISPR cassettes, which were then studied in detail. The repeat consensus sequences clustered into several stable classes that differed from the existing classification. Short DNA fragments similar to the cassette spacers were found more frequently at the same geographical location than at other locations (P < 0.0001). We developed a catalogue of elementary CRISPR-forming events and reconstructed the likely evolutionary history of cassettes that shared spacers. Metagenomic collections allow for relatively unbiased analysis of phage-host interactions and CRISPR evolution. The results of this study demonstrate that CRISPR cassettes retain a memory of the local virus population at a particular ocean location. CRISPR evolution may be described using a limited vocabulary of elementary events that have a natural biological interpretation.
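The abstract does not spell out the criteria used by the filtering procedure, so the following is only a minimal sketch of what a false-positive filter for predicted cassettes might look like. It assumes three commonly cited properties of genuine cassettes: repeats are nearly identical to one another, spacers are mutually dissimilar, and spacer lengths are roughly uniform. The function names and all thresholds are hypothetical and are not taken from the study.

```python
from itertools import combinations


def identity(a: str, b: str) -> float:
    """Fraction of matching positions, normalized by the longer sequence."""
    if not a or not b:
        return 0.0
    matches = sum(x == y for x, y in zip(a, b))
    return matches / max(len(a), len(b))


def mean_pairwise_identity(seqs) -> float:
    """Average identity over all unordered pairs; 1.0 if fewer than two sequences."""
    pairs = list(combinations(seqs, 2))
    if not pairs:
        return 1.0
    return sum(identity(a, b) for a, b in pairs) / len(pairs)


def looks_like_crispr(
    repeats,
    spacers,
    min_repeats=3,                 # assumed threshold, not from the study
    min_repeat_identity=0.9,       # assumed threshold, not from the study
    max_spacer_identity=0.6,       # assumed threshold, not from the study
    max_spacer_length_spread=0.3,  # assumed threshold, not from the study
) -> bool:
    """Heuristic check that a predicted cassette is not an ordinary tandem repeat."""
    # A cassette of n repeats is expected to contain n - 1 spacers.
    if len(repeats) < min_repeats or len(spacers) != len(repeats) - 1:
        return False
    # Repeats should be highly conserved.
    if mean_pairwise_identity(repeats) < min_repeat_identity:
        return False
    # Spacers should differ from one another; similar spacers suggest a tandem repeat.
    if mean_pairwise_identity(spacers) > max_spacer_identity:
        return False
    # Spacer lengths should vary only slightly around their mean.
    lengths = [len(s) for s in spacers]
    mean_length = sum(lengths) / len(lengths)
    return (max(lengths) - min(lengths)) / mean_length <= max_spacer_length_spread
```

A candidate whose "spacers" are copies of one another, as in a simple tandem repeat, is rejected by the spacer-identity check, while a candidate with conserved repeats and distinct spacers of similar length passes. In practice the thresholds would need to be tuned against cassettes of known status.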