Centre for Computational Biology, Institute of Cancer and Genomic Sciences, University of Birmingham, Birmingham B15 2TT, UK.
Sir William Dunn School of Pathology, Oxford University, Oxford OX1 3RE, UK.
Nucleic Acids Res. 2019 Mar 18;47(5):2229-2243. doi: 10.1093/nar/gkz094.
DNA replication is a stochastic process with replication forks emanating from multiple replication origins. The origins must be licenced in G1, and the replisome activated at licenced origins in order to generate bi-directional replication forks in S-phase. Differential firing times lead to origin interference, where a replication fork from an origin can replicate through and inactivate neighbouring origins (origin obscuring). We developed a Bayesian algorithm to characterize origin firing statistics from Okazaki fragment (OF) sequencing data. Our algorithm infers the distributions of firing times and the licencing probabilities for three consecutive origins. We demonstrate that our algorithm can distinguish partial origin licencing and origin obscuring in OF sequencing data from Saccharomyces cerevisiae and human cell types. We used our method to analyse the decreased origin efficiency under loss of Rat1 activity in S. cerevisiae, demonstrating that both reduced licencing and increased obscuring contribute. Moreover, we show that robust analysis is possible using only local data (across three neighbouring origins), and analysis of the whole chromosome is not required. Our algorithm utilizes an approximate likelihood and a reversible jump sampling technique, a methodology that can be extended to analysis of other mechanistic processes measurable through Next Generation Sequencing data.
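The distinction between partial licencing and origin obscuring can be illustrated with a small forward simulation. This is a minimal sketch, not the Bayesian inference algorithm described above. It assumes three origins on a line, independent licencing, exponentially distributed firing times and a constant fork speed. The exponential form, the function name `simulate_middle_origin` and all parameter values are hypothetical choices made for illustration and are not taken from the paper.

```python
import random


def simulate_middle_origin(positions, license_probs, mean_fire_times,
                           fork_speed, n_cells=20000, seed=0):
    """Monte Carlo over cells; return the fraction of cells in which the
    middle of three origins fired, was unlicensed, or was obscured.

    Assumptions (illustrative only): independent licencing, exponential
    firing times, constant fork speed, and no origins outside the triplet.
    """
    rng = random.Random(seed)
    counts = {"fired": 0, "unlicensed": 0, "obscured": 0}
    middle = positions[1]
    for _ in range(n_cells):
        # Draw a firing time for each licenced origin; None means unlicensed.
        fire = []
        for p, mean_t in zip(license_probs, mean_fire_times):
            if rng.random() < p:
                fire.append(rng.expovariate(1.0 / mean_t))
            else:
                fire.append(None)
        if fire[1] is None:
            counts["unlicensed"] += 1
            continue
        # Time at which a fork from each licenced neighbour would reach
        # the middle origin.
        arrivals = [fire[i] + abs(positions[i] - middle) / fork_speed
                    for i in (0, 2) if fire[i] is not None]
        # The middle origin is obscured if a neighbouring fork arrives
        # before its own firing time; otherwise it fires.
        if arrivals and min(arrivals) < fire[1]:
            counts["obscured"] += 1
        else:
            counts["fired"] += 1
    return {k: v / n_cells for k, v in counts.items()}


if __name__ == "__main__":
    # Hypothetical values: positions in kb, times in minutes, speed in kb/min.
    result = simulate_middle_origin(
        positions=(0.0, 30.0, 60.0),
        license_probs=(0.9, 0.7, 0.9),
        mean_fire_times=(20.0, 35.0, 25.0),
        fork_speed=1.5,
    )
    print(result)
```

In this toy model the efficiency of the middle origin is the "fired" fraction, and the same efficiency can arise from different mixtures of "unlicensed" and "obscured" cells. Separating those two contributions from OF sequencing data is the problem the algorithm addresses.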