Probabilistic models for CRISPR spacer content evolution.

Affiliation

IST Austria (Institute of Science and Technology Austria), Klosterneuburg, Austria.

Publication information

BMC Evol Biol. 2013 Feb 26;13:54. doi: 10.1186/1471-2148-13-54.

Abstract

BACKGROUND

The CRISPR/Cas system is known to act as an adaptive and heritable immune system in Eubacteria and Archaea. Immunity is encoded in an array of spacer sequences. Each spacer can provide specific immunity to invasive elements that carry the same or a similar sequence. Even in closely related strains, spacer content is very dynamic and evolves quickly. Standard models of nucleotide evolution cannot be applied to quantify its rate of change since processes other than single nucleotide changes determine its evolution.

METHODS

We present probabilistic models that are specific to spacer content evolution. They account for the distinct processes of insertion and deletion. Insertions can be constrained to occur at one end only or allowed to occur throughout the array. A single deletion event can affect one spacer or a whole fragment of adjacent spacers. Parameters of the underlying models are estimated for a pair of arrays by maximum likelihood using explicit ancestor enumeration.
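To make the insertion and deletion processes concrete, the following is a minimal simulation sketch, not the implementation used in the paper. It assumes details the abstract does not state: events arrive as a continuous-time Markov process, insertions hit the array at a constant rate `ins_rate`, a deletion starts at each spacer at rate `del_rate`, and the deleted fragment length is geometric with parameter `frag_p` (so `frag_p = 1` gives single-spacer deletions). The function name `evolve_array` and all parameter names and values are hypothetical.

```python
import random
from itertools import count


def evolve_array(array, time, ins_rate, del_rate, frag_p, rng, new_ids,
                 leader_only=True):
    """Evolve a spacer array for `time` units and return the new array.

    `array` is a list of spacer ids; index 0 is the end where insertions
    occur in the one-end model. All rates and the geometric fragment
    length are illustrative assumptions. Requires ins_rate > 0 and
    0 < frag_p <= 1.
    """
    array = list(array)  # work on a copy, leave the ancestor untouched
    t = 0.0
    while True:
        # Total event rate: one insertion process for the whole array,
        # plus one deletion process per spacer currently present.
        total = ins_rate + del_rate * len(array)
        t += rng.expovariate(total)
        if t > time:
            return array
        if rng.random() < ins_rate / total:
            # Insertion of a brand-new spacer.
            pos = 0 if leader_only else rng.randint(0, len(array))
            array.insert(pos, next(new_ids))
        else:
            # Deletion of a fragment of adjacent spacers.
            start = rng.randrange(len(array))
            length = 1
            while rng.random() > frag_p:  # extend with probability 1 - frag_p
                length += 1
            del array[start:start + length]  # slice is clipped at the array end


# Two strains diverging from one ancestral array (toy parameter values).
rng = random.Random(1)
new_ids = count(100)          # fresh ids for newly acquired spacers
ancestor = list(range(10))
strain_a = evolve_array(ancestor, 1.0, 2.0, 0.2, 0.5, rng, new_ids)
strain_b = evolve_array(ancestor, 1.0, 2.0, 0.2, 0.5, rng, new_ids)
print(strain_a)
print(strain_b)
print(sorted(set(strain_a) & set(strain_b)))  # spacers shared by both strains
```

In the one-end mode, new spacers accumulate at the front of the list while surviving ancestral spacers keep their relative order, which is the kind of structure a pairwise likelihood over possible ancestors could exploit.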

RESULTS

Simulations show that parameters are well estimated on average under the models presented here. There is a bias in the rate estimation when fragment deletions are included. The models also estimate times between pairs of strains. However, as time increases, spacer overlap goes to zero, so there is an upper bound on the distance that can be estimated. Spacer content similarities are displayed in a distance-based phylogeny using the estimated times. We use the presented models to analyze different Yersinia pestis data sets and find that the results among them are largely congruent. The models also capture the variation in diversity of spacers among the data sets. A comparison of spacer-based phylogenies and Cas gene phylogenies shows that they resolve very different time scales for this data set.
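The upper bound on estimable distance can be illustrated with a toy calculation. This is a deliberately simplified sketch, not the maximum-likelihood estimator described in the abstract: it assumes each ancestral spacer is lost independently at a constant rate `del_rate` in each of two lineages, and that the number of ancestral spacers is known. Both function names are hypothetical.

```python
import math


def expected_shared_fraction(t, del_rate):
    """Expected fraction of ancestral spacers still present in both strains.

    A spacer survives time t in one lineage with probability
    exp(-del_rate * t), so it survives in both with exp(-2 * del_rate * t).
    """
    return math.exp(-2.0 * del_rate * t)


def naive_time_estimate(shared, ancestral, del_rate):
    """Invert the expected overlap to get a time; saturates at zero overlap."""
    if shared == 0:
        # No shared spacers left: every sufficiently large time fits
        # equally well, so the estimate is unbounded.
        return math.inf
    return -math.log(shared / ancestral) / (2.0 * del_rate)


for t in (0.5, 2.0, 5.0, 10.0):
    print(t, expected_shared_fraction(t, del_rate=0.5))

print(naive_time_estimate(shared=3, ancestral=10, del_rate=0.5))
print(naive_time_estimate(shared=0, ancestral=10, del_rate=0.5))
```

Under these assumptions the expected overlap decays exponentially, and once no spacers are shared the data can no longer distinguish a large divergence time from an even larger one.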

CONCLUSIONS

The simulations and data analyses show that the presented models are useful for quantifying spacer content evolution and for displaying spacer content similarities of closely related strains in a phylogeny. This allows for comparisons of different CRISPR arrays or for comparisons between CRISPR arrays and nucleotide substitution rates.


