Modeling the Amplification of Immunoglobulins through Machine Learning on Sequence-Specific Features.

Affiliations

Department of Computational Biology and Applied Algorithmics, Max Planck Institute for Informatics, Saarland Informatics Campus, 66123, Saarbrücken, Germany.

Institute of Virology, University of Cologne, Fürst-Pückler-Str. 56, 50935, Cologne, Germany.

Publication information

Sci Rep. 2019 Jul 24;9(1):10748. doi: 10.1038/s41598-019-47173-w.

DOI:10.1038/s41598-019-47173-w

PMID:31341211

Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC6656877/

Abstract

Successful primer design for polymerase chain reaction (PCR) hinges on the ability to identify primers that efficiently amplify template sequences. Here, we generated a novel Taq PCR data set that reports the amplification status for pairs of primers and templates from a reference set of 47 immunoglobulin heavy chain variable sequences and 20 primers. Using logistic regression, we developed TMM, a model for predicting whether a primer amplifies a template given their nucleotide sequences. The model suggests that the free energy of annealing, ΔG, is the key driver of amplification (p = 7.35e-12) and that 3' mismatches should be considered in dependence on ΔG and the mismatch closest to the 3' terminus (p = 1.67e-05). We validated TMM by comparing its estimates with those from the thermodynamic model of DECIPHER (DE) and a model based solely on the free energy of annealing (FE). TMM outperformed the other approaches in terms of the area under the receiver operating characteristic curve (TMM: 0.953, FE: 0.941, DE: 0.896). TMM can improve primer design and is freely available via openPrimeR (http://openPrimeR.mpi-inf.mpg.de).
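To make the modeling idea concrete, the following is a minimal Python sketch of a logistic model that maps the free energy of annealing and the position of the mismatch closest to the 3' terminus to an amplification probability, together with a plain rank-based computation of the area under the ROC curve. It is not the TMM implementation from openPrimeR: the feature encoding, the function names, and all coefficient values are hypothetical and chosen only for illustration.

```python
import math


def amplification_probability(delta_g, mismatch_pos_3prime, coef):
    """Return P(amplified) from a logistic model (illustrative, not TMM itself).

    delta_g: free energy of annealing in kcal/mol (more negative = more stable).
    mismatch_pos_3prime: position of the mismatch closest to the 3' terminus,
        counted from the 3' end (1 = terminal base); 0 means no 3' mismatch.
        This encoding is an assumption made for the sketch.
    coef: dict with the hypothetical keys "intercept", "delta_g", "mismatch",
        and "interaction".
    """
    # Hypothetical mismatch feature: larger when the mismatch is nearer the 3' end.
    closeness = 1.0 / mismatch_pos_3prime if mismatch_pos_3prime > 0 else 0.0
    linear = (
        coef["intercept"]
        + coef["delta_g"] * delta_g
        + coef["mismatch"] * closeness
        + coef["interaction"] * delta_g * closeness
    )
    # Logistic (sigmoid) link turns the linear predictor into a probability.
    return 1.0 / (1.0 + math.exp(-linear))


def roc_auc(labels, scores):
    """Area under the ROC curve as the fraction of correctly ordered pairs.

    labels: iterable of 0/1 amplification outcomes.
    scores: iterable of predicted probabilities, same length as labels.
    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


# Hypothetical coefficients; real values would come from fitting to PCR data.
EXAMPLE_COEF = {
    "intercept": -4.0,
    "delta_g": -0.5,
    "mismatch": -2.0,
    "interaction": 0.1,
}

if __name__ == "__main__":
    stable = amplification_probability(-12.0, 0, EXAMPLE_COEF)
    mismatched = amplification_probability(-12.0, 1, EXAMPLE_COEF)
    print(f"no 3' mismatch: {stable:.3f}, terminal mismatch: {mismatched:.3f}")
```

With these made-up coefficients, a more negative ΔG raises the predicted probability and a mismatch near the 3' terminus lowers it, which mirrors the qualitative relationship described in the abstract. In practice the coefficients would be estimated from labeled primer-template pairs, and `roc_auc` would be applied to held-out predictions.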
