Khosrokhavar Roya, Ghasemi Jahan Bakhsh, Shiri Fereshteh
Food and Drug Laboratory Research Center, MOH & ME, Tehran, Iran
Int J Mol Sci. 2010 Aug 31;11(9):3052-68. doi: 10.3390/ijms11093052.
In this work, support vector machines (SVMs) and multiple linear regression (MLR) were used for a quantitative structure-property relationship (QSPR) study of retention time (t_R) in standardized liquid chromatography-UV-mass spectrometry of 67 mycotoxins (aflatoxins, trichothecenes, roquefortines and ochratoxins), based on molecular descriptors calculated from the optimized 3D structures. After screening the descriptors with missing-value, zero-value and multicollinearity tests (cutoff value of 0.95), a genetic algorithm was used for variable selection to identify the most relevant descriptors, from which the MLR and SVM models were built. The robustness of the QSPR models was assessed by statistical validation and by applicability domain (AD) analysis, the latter using a Williams plot. The predictions from both models are in good agreement with the experimental values: the correlation and predictability measures r² and q² are 0.931 and 0.932, respectively, for SVM, and 0.923 and 0.915, respectively, for MLR. The effects of the different descriptors on retention time are described.
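The screening and validation steps named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the data are synthetic (random numbers, not mycotoxin descriptors), the genetic algorithm and SVM steps are omitted, and all function names are hypothetical. It also assumes that the multicollinearity test means dropping a descriptor whose absolute pairwise correlation with an already kept descriptor reaches 0.95, that q² is the leave-one-out cross-validated r², and that the Williams plot uses the commonly quoted warning leverage h* = 3(p + 1)/n; the abstract does not confirm any of these details.

```python
import numpy as np

def drop_collinear(X, cutoff=0.95):
    """Greedy filter: keep a column only if its |r| with every kept column is below cutoff."""
    corr = np.abs(np.corrcoef(X, rowvar=False))
    keep = []
    for j in range(X.shape[1]):
        if all(corr[j, k] < cutoff for k in keep):
            keep.append(j)
    return keep

def fit_mlr(X, y):
    """Ordinary least squares with an intercept; returns [b0, b1, ..., bp]."""
    A = np.column_stack([np.ones(len(y)), X])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return coef

def predict(coef, X):
    return coef[0] + X @ coef[1:]

def r2(y, y_hat):
    """Coefficient of determination, 1 - SS_res / SS_tot."""
    ss_res = np.sum((y - y_hat) ** 2)
    ss_tot = np.sum((y - y.mean()) ** 2)
    return 1.0 - ss_res / ss_tot

def q2_loo(X, y):
    """Leave-one-out cross-validated r2 (assumed definition of q2)."""
    n = len(y)
    preds = np.empty(n)
    for i in range(n):
        mask = np.arange(n) != i
        coef = fit_mlr(X[mask], y[mask])
        preds[i] = predict(coef, X[i:i + 1])[0]
    return r2(y, preds)

def leverages(X):
    """Diagonal of the hat matrix, the x-axis of a Williams plot."""
    A = np.column_stack([np.ones(X.shape[0]), X])
    H = A @ np.linalg.pinv(A.T @ A) @ A.T
    return np.diag(H)

# Synthetic stand-in data: 67 rows, 4 independent descriptors plus one near-duplicate.
rng = np.random.default_rng(0)
X_base = rng.normal(size=(67, 4))
near_dup = X_base[:, 0] + 0.01 * rng.normal(size=67)
X_all = np.column_stack([X_base, near_dup])
y = 10.0 + X_base @ np.array([1.5, -2.0, 0.5, 1.0]) + 0.3 * rng.normal(size=67)

kept = drop_collinear(X_all, cutoff=0.95)   # the near-duplicate column is dropped
X = X_all[:, kept]

coef = fit_mlr(X, y)
r2_fit = r2(y, predict(coef, X))
q2 = q2_loo(X, y)

h = leverages(X)
h_star = 3 * (X.shape[1] + 1) / X.shape[0]  # assumed warning leverage
outside_ad = np.flatnonzero(h > h_star)     # points flagged as outside the AD

print("kept columns:", kept)
print("r2 = %.3f, q2 = %.3f" % (r2_fit, q2))
print("warning leverage h* = %.3f, flagged rows: %s" % (h_star, outside_ad.tolist()))
```

In a full Williams plot the leverages would be plotted against standardized residuals, with compounds beyond h* or beyond a residual limit treated as outside the applicability domain. For least-squares fits, q² computed this way cannot exceed r², so a large gap between the two is a common sign of overfitting.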