Waterman M S
Nucleic Acids Res. 1983 Dec 20;11(24):8951-6. doi: 10.1093/nar/11.24.8951.
Restriction sites or other sequence patterns are usually assumed to occur according to a Poisson distribution with mean equal to the reciprocal of the probability of the given site or pattern. For situations where non-overlapping occurrences of patterns, such as restriction sites, are the objects of interest, this note shows that the Poisson assumption is frequently misleading. Both the case of base composition (independent bases) and that of dinucleotide frequencies (Markov chains) are treated. Moreover, a new technique is presented which allows treatment of collections of patterns, where the departure from the Poisson assumption is even more striking. This latter case includes double digests, and an example of a five-enzyme digest is included.
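The abstract gives no formulas, so the following is only a minimal sketch of why non-overlapping counts can depart from the naive expectation, restricted to the independent-bases case. It is not the paper's technique. It uses the standard waiting-time result for independent letters, in which every prefix of the pattern that is also a suffix contributes the reciprocal of its probability to the mean spacing between successive non-overlapping occurrences. The function names (`expected_gap`, `count_nonoverlapping`, `simulate_mean_gap`) and the uniform base frequencies are assumptions made for illustration; the Markov-chain and multi-pattern cases are not covered.

```python
import random


def expected_gap(pattern, probs):
    """Mean spacing between non-overlapping occurrences, independent bases.

    Sums 1 / P(prefix) over every prefix of `pattern` that is also a
    suffix (the full pattern always qualifies). Standard waiting-time
    result for i.i.d. letters; assumed here, not taken from the paper.
    """
    m = len(pattern)
    total = 0.0
    for k in range(1, m + 1):
        if pattern[:k] == pattern[m - k:]:
            p = 1.0
            for base in pattern[:k]:
                p *= probs[base]
            total += 1.0 / p
    return total


def count_nonoverlapping(seq, pattern):
    """Count occurrences left to right, resuming the scan after each match."""
    count, i = 0, 0
    while True:
        j = seq.find(pattern, i)
        if j < 0:
            return count
        count += 1
        i = j + len(pattern)


def simulate_mean_gap(pattern, probs, length, seed=0):
    """Empirical mean spacing in one random sequence of independent bases."""
    rng = random.Random(seed)
    bases = list(probs)
    weights = [probs[b] for b in bases]
    seq = "".join(rng.choices(bases, weights=weights, k=length))
    n = count_nonoverlapping(seq, pattern)
    return length / n if n else float("inf")


if __name__ == "__main__":
    uniform = {"A": 0.25, "C": 0.25, "G": 0.25, "T": 0.25}  # assumed composition
    for site in ("GAATTC", "AAAA"):
        naive = 1.0 / (0.25 ** len(site))
        print(site, naive, expected_gap(site, uniform),
              simulate_mean_gap(site, uniform, 2_000_000))
```

With uniform base frequencies, a site that cannot overlap itself, such as GAATTC, has mean spacing equal to the reciprocal of its probability (4096), whereas the self-overlapping pattern AAAA has mean spacing 340 rather than the naive 256.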