Rachlin John, Ding Chunming, Cantor Charles, Kasif Simon
Bioinformatics program, Boston University, MA 02215, USA.
BMC Genomics. 2005 Jul 25;6:102. doi: 10.1186/1471-2164-6-102.
Multiplex PCR is a key technology for detecting infectious microorganisms, whole-genome sequencing, forensic analysis, and flexible yet low-cost genotyping. However, designing multiplex PCR assays requires weighing multiple competing objectives and physical constraints, and extensive computational analysis is needed to identify potential primer-dimers, which can reduce product yield.
This paper examines the computational design limits of multiplex PCR in the context of SNP genotyping and analyzes the tradeoffs among several key design factors: multiplexing level (the number of primer pairs per tube), coverage (the percentage of SNPs whose associated primers are actually assigned to one of the available tubes), and tube-size uniformity. We also examine how design performance depends on the total number of available SNPs from which to choose and on the primer stringency criteria. We show that finding high-multiplexing, high-coverage designs is subject to a computational phase transition, becoming dramatically more difficult when the probability of primer pair interaction exceeds a critical threshold. The precise location of this critical transition point depends on the number of available SNPs and the level of multiplexing required. We also demonstrate how coverage is affected by the number of available SNPs, primer selection criteria, and target multiplexing levels.
The presence of a phase transition suggests limits to scaling multiplex PCR performance for high-throughput genomics applications. Achieving broad SNP coverage rapidly transitions from being very easy to very hard as the target multiplexing level (the number of primer pairs per tube) increases. The onset of the phase transition can be "delayed" by drawing on a larger pool of SNPs, or by loosening primer selection constraints so as to increase the number of candidate primer pairs per SNP, though the latter may produce other adverse effects. The resulting design performance tradeoffs define a benchmark that can serve as the basis for comparing competing multiplex PCR design optimization algorithms, and they also provide general rules of thumb for experimentalists seeking to understand the performance limits of standard multiplex PCR.
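The relationship between interaction probability and coverage described above can be explored with a small simulation. The sketch below is illustrative only and is not the method used in the paper: it assumes one primer pair per SNP, assumes that any two primer pairs interact independently with a fixed probability `p_interact`, and uses a simple first-fit greedy assignment. The function name `simulate_coverage` and all of its parameters are hypothetical.

```python
import random


def simulate_coverage(n_snps, n_tubes, tube_size, p_interact, seed=0):
    """Return the fraction of SNPs assigned to a tube by first-fit greedy.

    Assumptions (not from the paper): one primer pair per SNP, and each
    pair of SNPs conflicts independently with probability p_interact.
    """
    rng = random.Random(seed)

    # conflicts[i] holds the SNPs whose primers interact with those of SNP i.
    conflicts = [set() for _ in range(n_snps)]
    for i in range(n_snps):
        for j in range(i + 1, n_snps):
            if rng.random() < p_interact:
                conflicts[i].add(j)
                conflicts[j].add(i)

    tubes = [[] for _ in range(n_tubes)]
    assigned = 0
    for snp in range(n_snps):
        # Place the SNP in the first tube that has room and no conflict.
        for tube in tubes:
            if len(tube) < tube_size and not conflicts[snp].intersection(tube):
                tube.append(snp)
                assigned += 1
                break
    return assigned / n_snps


if __name__ == "__main__":
    # Sweep the interaction probability for a fixed multiplexing level.
    for p in (0.0, 0.05, 0.1, 0.2, 0.4):
        cov = simulate_coverage(n_snps=200, n_tubes=10, tube_size=20, p_interact=p)
        print(f"p_interact={p:.2f}  coverage={cov:.2f}")
```

Here `n_tubes * tube_size` equals `n_snps`, so full coverage is possible only if every tube can be filled without conflicts. Raising `tube_size` while lowering `n_tubes` to keep capacity fixed gives a rough feel for how the target multiplexing level affects coverage under these simplified assumptions.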