Ri Myong-Rim, Kang Jin-Sok, Ri Myong-Ryong, U Song Nam
Department of Life Science, University of Science, Pyongyang, Democratic People's Republic of Korea.
Heliyon. 2023 Aug 1;9(8):e18829. doi: 10.1016/j.heliyon.2023.e18829. eCollection 2023 Aug.
The amplification and specificity of the polymerase chain reaction (PCR) are affected by the position and type of primer-template mismatches (MMs) as well as by the reaction conditions. In this study, multiple linear regression (MLR) models and artificial neural network (ANN) models were developed to predict the effects of primer-template mismatches on primer extension efficiency in the primer-template duplex. In the MLR models, the independent variable representing the position effect of the i-th mismatch from the 3' end of the primer was normalized to a value between 0 and 1 according to the magnitude of ΔΔG, the difference in Gibbs free energy change between the mismatch and its corresponding perfect match, whereas the independent variables representing the position effect of the j-th perfect match from the 3' end of the primer were coded as 1. The dependent variable of the MLR models was the relative extension efficiency of the primers. In the ANN models, the input layer has as many neurons as the corresponding MLR model has independent variables, and the hidden layer and the output layer have four neurons and one neuron, respectively. Our MLR and ANN models outperform the previous polynomial regression model in predicting the single base extension (SBE) efficiencies of single-MM primers. In particular, ANN model 6, which has 32 input-layer neurons representing the effects of mismatch position, mismatch type, and annealing temperature on the primer-template duplex, predicts the SBE efficiencies of single-MM primers with high accuracy: its correlation coefficients R for the training set, the testing set, and all data are 0.9870, 0.9782, and 0.9857, respectively. These results are promising for primer design and for testing primer specificity against genome databases.
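To make the variable coding and the network shape described in the abstract more concrete, the following is a minimal illustrative sketch in plain Python. It is not the authors' implementation. The function names, the min-max scaling of ΔΔG, the 0-based position index counted from the 3' end, the tanh hidden activation, the linear output, and the random initial weights are all assumptions made for illustration; the abstract specifies only that mismatch positions are scaled to values between 0 and 1 according to ΔΔG, that perfect-match positions are coded 1, and that ANN model 6 has 32 input, 4 hidden, and 1 output neurons.

```python
import math
import random


def encode_positions(n_positions, mm_index, ddg, ddg_min, ddg_max):
    """Hypothetical position coding for a single-mismatch primer.

    Perfect-match positions are coded 1.0. The mismatch position
    (0-based from the 3' end, an assumption) receives ddG min-max
    scaled into [0, 1]; the exact scaling rule is assumed here.
    """
    if ddg_max <= ddg_min:
        raise ValueError("ddg_max must exceed ddg_min")
    x = [1.0] * n_positions
    scaled = (ddg - ddg_min) / (ddg_max - ddg_min)
    x[mm_index] = min(1.0, max(0.0, scaled))
    return x


def init_network(n_inputs, n_hidden=4, seed=0):
    """Random small weights for an n_inputs -> n_hidden -> 1 network."""
    rng = random.Random(seed)
    return {
        "W1": [[rng.gauss(0.0, 0.1) for _ in range(n_inputs)]
               for _ in range(n_hidden)],
        "b1": [0.0] * n_hidden,
        "W2": [rng.gauss(0.0, 0.1) for _ in range(n_hidden)],
        "b2": 0.0,
    }


def forward(params, x):
    """Forward pass: tanh hidden layer, linear output (both assumed)."""
    hidden = [
        math.tanh(sum(w * xi for w, xi in zip(row, x)) + b)
        for row, b in zip(params["W1"], params["b1"])
    ]
    return sum(w * h for w, h in zip(params["W2"], hidden)) + params["b2"]


def pearson_r(y_true, y_pred):
    """Pearson correlation coefficient R between observed and predicted values."""
    n = len(y_true)
    mean_t = sum(y_true) / n
    mean_p = sum(y_pred) / n
    cov = sum((t - mean_t) * (p - mean_p) for t, p in zip(y_true, y_pred))
    var_t = sum((t - mean_t) ** 2 for t in y_true)
    var_p = sum((p - mean_p) ** 2 for p in y_pred)
    return cov / math.sqrt(var_t * var_p)


# Usage: code a 10-position primer with one mismatch, then build an
# untrained 32-4-1 network with the layer sizes reported for ANN model 6.
x = encode_positions(n_positions=10, mm_index=2, ddg=1.5, ddg_min=0.0, ddg_max=3.0)
net = init_network(n_inputs=32)
untrained_output = forward(net, [0.0] * 32)
```

The network here is untrained, so its outputs carry no predictive meaning. In practice the weights would be fitted to measured relative extension efficiencies, and `pearson_r` would then be evaluated separately on the training set, the testing set, and all data, as the abstract reports.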