Department of Computational Biology and Applied Algorithmics, Max Planck Institute for Informatics, Saarland Informatics Campus, 66123, Saarbrücken, Germany.
Institute of Virology, University of Cologne, Fürst-Pückler-Str. 56, 50935, Cologne, Germany.
Sci Rep. 2019 Jul 24;9(1):10748. doi: 10.1038/s41598-019-47173-w.
Successful primer design for polymerase chain reaction (PCR) hinges on the ability to identify primers that efficiently amplify template sequences. Here, we generated a novel Taq PCR data set that reports the amplification status for pairs of primers and templates from a reference set of 47 immunoglobulin heavy chain variable sequences and 20 primers. Using logistic regression, we developed TMM, a model for predicting whether a primer amplifies a template given their nucleotide sequences. The model suggests that the free energy of annealing, ΔG, is the key driver of amplification (p = 7.35e-12) and that 3' mismatches should be considered in relation to ΔG and the position of the mismatch closest to the 3' terminus (p = 1.67e-05). We validated TMM by comparing its estimates with those from the thermodynamic model of DECIPHER (DE) and a model based solely on the free energy of annealing (FE). TMM outperformed the other approaches in terms of the area under the receiver operating characteristic curve (TMM: 0.953, FE: 0.941, DE: 0.896). TMM can improve primer design and is freely available via openPrimeR (http://openPrimeR.mpi-inf.mpg.de).
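The kind of model described above can be sketched in a few lines of Python. This is an illustration only, not the fitted TMM: the coefficients, the 6-base 3' window, the penalty encoding, and all function names are hypothetical assumptions chosen to show how a logistic model can combine ΔG with the position of the mismatch closest to the 3' terminus, and how an area under the ROC curve is computed from predicted scores.

```python
import math

# Hypothetical coefficients for illustration; NOT the values fitted in TMM.
INTERCEPT = -4.0
BETA_DG = -0.6            # more negative dG (stronger annealing) raises the odds
BETA_INTERACTION = 0.05   # dG x 3'-mismatch penalty interaction (assumed form)
WINDOW = 6                # assumed size of the 3' region that is inspected


def closest_3prime_mismatch(primer, site, window=WINDOW):
    """Distance from the 3' end (1 = terminal base) of the nearest mismatch
    within `window` bases, or 0 if that region matches perfectly.
    Both strings are assumed to be aligned, equal-length, and written 5'->3'."""
    if len(primer) != len(site):
        raise ValueError("primer and binding site must have equal length")
    for dist in range(1, min(window, len(primer)) + 1):
        if primer[-dist] != site[-dist]:
            return dist
    return 0


def amplification_probability(delta_g, mismatch_dist, window=WINDOW):
    """Logistic model: logit(p) = b0 + b1*dG + b2*dG*penalty, where the
    penalty grows as the mismatch moves closer to the 3' terminus."""
    penalty = 0 if mismatch_dist == 0 else window - mismatch_dist + 1
    logit = INTERCEPT + BETA_DG * delta_g + BETA_INTERACTION * delta_g * penalty
    return 1.0 / (1.0 + math.exp(-logit))


def auc(scores, labels):
    """Area under the ROC curve as the fraction of (positive, negative)
    pairs ranked correctly, counting ties as one half."""
    pos = [s for s, y in zip(scores, labels) if y]
    neg = [s for s, y in zip(scores, labels) if not y]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))
```

With these assumed coefficients, a primer with a terminal 3' mismatch receives a lower predicted amplification probability than a perfectly matching primer at the same ΔG. A real application would estimate the coefficients from labeled amplification data rather than set them by hand.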