Walters K
Division of Genomic Medicine, School of Medicine and Biomedical Sciences, University of Sheffield, Sheffield S10 2RX, UK.
J Theor Biol. 2007 Mar 7;245(1):161-8. doi: 10.1016/j.jtbi.2006.09.028. Epub 2006 Oct 1.
Metabolites and certain chemical agents (for example, methyl methanesulfonate) can induce nucleotide bases on chromosomal strands to become alkylated. These alkylated sites have the potential to become single-strand chromosomal breaks, a form of DNA damage, if they are exposed to a sufficiently high temperature in vitro. It has been proposed that a single-strand break (SSB) sufficiently close to another SSB on the opposite chromosomal strand will form a double-strand break (DSB). DNA repair mechanisms are less able to repair DSBs than SSBs. Because of the complex three-dimensional structure of DNA, some chromosomal regions are more susceptible to alkylation than others. A question of interest is therefore whether these alkylated bases are randomly distributed or tend to be clustered. Pulsed-field gel electrophoresis allows the number of DNA fragments (and hence the number of DSBs) to be observed directly. The randomness of alkylation events can therefore be tested using the standard statistical hypothesis-testing framework. Under the null hypothesis that the SSBs are randomly distributed on each of the strands, we can calculate the probability of observing a number of DSBs at least as large as that observed, which is the associated p-value. Previously, the probability distribution of the number of DSBs has been determined by Monte Carlo simulation; when considering the whole genome this can be very time consuming. In this paper, we theoretically derive an approximation to the distribution, enabling the appropriate probabilities to be calculated quickly. Based on previous findings, we assume that the number of breaks on each strand is small compared to the number of nucleotide bases. We show that our method can give the correct probability distribution when alkylation events are relatively rare, discuss how rare these events have to be, and suggest potential extensions to the model for when a greater proportion of bases are alkylated.
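The Monte Carlo baseline mentioned in the abstract can be illustrated with a minimal Python sketch. It is not the paper's approximation, and it is not taken from the paper. All names (`count_dsbs`, `simulate_null`, `monte_carlo_p_value`) and parameters are hypothetical. It also assumes a simple counting rule that the abstract does not specify: a DSB is counted for each SSB on one strand that has at least one SSB on the opposite strand within `max_gap` bases.

```python
import bisect
import random


def count_dsbs(breaks_a, breaks_b, max_gap):
    """Count strand-A breaks with at least one strand-B break within max_gap bases.

    Assumed counting rule for illustration only; the abstract does not
    state how close two SSBs must be, or how overlapping pairs are counted.
    """
    breaks_b = sorted(breaks_b)
    count = 0
    for pos in breaks_a:
        # Index of the first strand-B break at position >= pos - max_gap.
        i = bisect.bisect_left(breaks_b, pos - max_gap)
        if i < len(breaks_b) and breaks_b[i] <= pos + max_gap:
            count += 1
    return count


def simulate_null(n_bases, n_a, n_b, max_gap, n_sims, rng):
    """Simulate DSB counts with SSBs placed uniformly at random on each strand."""
    counts = []
    for _ in range(n_sims):
        a = rng.sample(range(n_bases), n_a)  # distinct break positions, strand A
        b = rng.sample(range(n_bases), n_b)  # distinct break positions, strand B
        counts.append(count_dsbs(a, b, max_gap))
    return counts


def monte_carlo_p_value(observed, null_counts):
    """Proportion of simulated counts at least as large as the observed count."""
    return sum(c >= observed for c in null_counts) / len(null_counts)


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed so the run is repeatable
    # Illustrative values only, not taken from the paper.
    null_counts = simulate_null(
        n_bases=1_000_000, n_a=500, n_b=500, max_gap=10, n_sims=200, rng=rng
    )
    print(monte_carlo_p_value(observed=8, null_counts=null_counts))
```

Each simulated replicate costs time that grows with the number of breaks, and a small p-value needs many replicates to estimate reliably. That is consistent with the abstract's remark that simulation becomes very time consuming at whole-genome scale.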