Baumdicker F, Huebner A M I, Pfaffelhuber P
Department of Mathematical Stochastics, Albert-Ludwigs-University of Freiburg, Germany.
Theor Popul Biol. 2018 Feb;119:72-82. doi: 10.1016/j.tpb.2017.11.001. Epub 2017 Nov 22.
The CRISPR (clustered regularly interspaced short palindromic repeats) region within bacterial and archaeal genomes is known to encode an adaptive immune system. We build on previous results on the evolution of CRISPR arrays, which led to the ordered independent loss model introduced by Kupczok and Bollback (2013). This model focuses on the spacers (the sequences between the repeats): new spacers enter a CRISPR array at rate θ at the leader end of the array, while each spacer present is lost at rate ρ along the phylogeny relating the sample. Within this model, we compute the distribution of distances of spacers that are present in all arrays in a sample of size n. We use these results to estimate the loss rate ρ from spacer array data for n=2 and n=3.
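To make the dynamics described in the abstract concrete, the following is a minimal sketch of the ordered independent loss model along a single lineage, not the authors' implementation. It assumes that insertions occur as a Poisson process with rate θ at the leader end (index 0) and that each spacer present is lost independently at rate ρ, as the model's name suggests. The function name `simulate_array`, its parameters, and the integer spacer labels are hypothetical and chosen only for illustration.

```python
import random


def simulate_array(theta, rho, t, initial=(), seed=None):
    """Simulate one spacer array for a time span t (hypothetical helper).

    Assumptions (not taken from the paper's text):
      - index 0 of the list is the leader end,
      - new spacers arrive at rate theta and are inserted at index 0,
      - each spacer present is lost independently at rate rho.
    Spacers are labelled by integers; larger labels are newer.
    """
    rng = random.Random(seed)
    array = list(initial)
    next_id = max(array, default=-1) + 1
    time = 0.0
    while True:
        total_rate = theta + rho * len(array)
        if total_rate == 0:
            # Nothing can happen any more (no gain, no spacers to lose).
            return array
        # Waiting time until the next event (gain or loss).
        time += rng.expovariate(total_rate)
        if time > t:
            return array
        if rng.random() < theta / total_rate:
            # Gain: a new spacer enters at the leader end.
            array.insert(0, next_id)
            next_id += 1
        else:
            # Loss: one uniformly chosen spacer is removed; order is kept.
            array.pop(rng.randrange(len(array)))


if __name__ == "__main__":
    # Two arrays that evolve independently from a common ancestral array,
    # mimicking a sample of size n=2 (parameter values are arbitrary).
    ancestor = [4, 3, 2, 1, 0]
    a = simulate_array(theta=1.0, rho=0.5, t=1.0, initial=ancestor, seed=1)
    b = simulate_array(theta=1.0, rho=0.5, t=1.0, initial=ancestor, seed=2)
    print(a)
    print(b)
```

Because spacers only enter at the leader end and losses never reorder the array, the surviving spacers always appear from newest to oldest. The two lineages label their new spacers independently, so only the ancestral labels 0 to 4 are comparable between the two arrays in the usage example.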