Doi K, Imai H
Genome Inform Ser Workshop Genome Inform. 1997;8:43-52.
Selecting a good collection of primers is very important for polymerase chain reaction (PCR) experiments. Most existing algorithms for primer selection are concerned with computing a primer pair for each DNA sequence. In generalizing the arbitrarily primed PCR, etc., to the case that all DNA sequences of target objects are already known, like about 6000 ORFs of yeast, we may design a small set of primers so that all the targets are PCR amplified and resolved electrophoretically in a series of experiments. This is quite useful because deceasing the number of primers greatly reduces the cost of experiments. Pearson et al. (ISMB 1995: 285-291, 1995; Discrete Appl. Math. 71: 231-246, 1996) consider finding a minimum set of primers covering all given DNA sequences, but their method does not meet necessary biological conditions such as primer amplification and electrophoresis resolution. In this paper, based on the modeling and computational complexity analysis by Doi, we propose algorithms for this primer selection problem. These algorithms do not necessarily minimize the number of primers, but, since basic versions of these problems are shown to be computationally intractable, especially even for approximability with the length resolution condition, this is inevitable. In the algorithms, the amplification condition by a primer pair and the length resolution condition by electrophoresis are incorporated. These algorithms are based on the theoretically well-founded greedy algorithm for the set cover in computer science. Preliminary computational results are presented to show the validity of this approach. The number of computed primers is much less than a half of the number of targets, and hence is less than one forth of the number needed in the multiplex PCR.
对于聚合酶链反应(PCR)实验而言,选择一组优良的引物非常重要。大多数现有的引物选择算法都专注于为每个DNA序列计算一对引物。在将任意引物PCR等方法推广到目标对象的所有DNA序列都已知的情况时,比如酵母的约6000个开放阅读框(ORF),我们可以设计一小组引物,以便在一系列实验中对所有目标进行PCR扩增并通过电泳分离。这非常有用,因为减少引物数量能大幅降低实验成本。Pearson等人(《ISMB 1995:285 - 291,1995》;《离散应用数学》71:231 - 246,1996)考虑寻找覆盖所有给定DNA序列的最小引物集,但他们的方法不符合引物扩增和电泳分辨率等必要的生物学条件。在本文中,基于Doi的建模和计算复杂性分析,我们针对这个引物选择问题提出了算法。这些算法不一定能使引物数量最小化,但由于这些问题的基本版本已被证明在计算上是棘手的,特别是即使对于具有长度分辨率条件的近似性也是如此,所以这是不可避免的。在这些算法中,纳入了引物对的扩增条件和电泳的长度分辨率条件。这些算法基于计算机科学中关于集合覆盖的理论基础扎实的贪心算法。给出了初步的计算结果以证明这种方法的有效性。计算出的引物数量远少于目标数量的一半,因此少于多重PCR所需数量的四分之一。