Doucet J P, Papa E, Doucet-Panaye A, Devillers J
a ITODYS, Paris-Diderot University, UMR 7086, Paris, France.
b QSAR Research Unit in Environmental Chemistry and Ecotoxicology, Department of Theoretical and Applied Science, University of Insubria, Varese, Italy.
SAR QSAR Environ Res. 2017 Jun;28(6):451-470. doi: 10.1080/1062936X.2017.1328855. Epub 2017 Jun 12.
QSAR models are proposed for predicting the toxicity of 33 piperidine derivatives against Aedes aegypti. Starting from 2D topological descriptors calculated with the PaDEL software, four-variable models were derived using ordinary least squares multilinear regression (OLS-MLR) in the QSARINS software and using machine learning and related approaches: linear and radial support vector machine (SVM), projection pursuit regression (PPR), radial basis function neural network (RBFNN), general regression neural network (GRNN) and k-nearest neighbours (k-NN). Their robustness and predictive ability were evaluated through both internal and external validation. Determination coefficients (r²) greater than 0.85 on the training sets and 0.8 on the test sets were obtained with OLS-MLR and linear SVM. These two methods slightly outperform PPR, radial SVM and RBFNN, whereas GRNN and k-NN showed lower performance. The easy availability of the structural descriptors involved and the simplicity of the MLR model make that model attractive at an exploratory level for proposing, from this limited dataset, guidelines for the design of new potentially active molecules.
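The evaluation scheme summarized above (a four-variable OLS-MLR model scored by r² on a training set and on an external test set) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the data are synthetic random numbers standing in for the 33 compounds, the descriptors are unnamed placeholders rather than PaDEL output, the 25/8 split is hypothetical, and plain NumPy is used in place of QSARINS. It does not reproduce the authors' models or results.

```python
import numpy as np

def fit_ols(X, y):
    """Fit y = b0 + X @ b by ordinary least squares; returns [b0, b1, ..., bk]."""
    design = np.column_stack([np.ones(len(X)), X])  # prepend intercept column
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    return coef

def predict(coef, X):
    """Apply fitted coefficients (intercept first) to a descriptor matrix."""
    return coef[0] + X @ coef[1:]

def r_squared(y_true, y_pred):
    """Determination coefficient: 1 - SS_res / SS_tot."""
    ss_res = np.sum((y_true - y_pred) ** 2)
    ss_tot = np.sum((y_true - np.mean(y_true)) ** 2)
    return 1.0 - ss_res / ss_tot

# Synthetic stand-in data (assumption): 33 "compounds", 4 placeholder descriptors.
rng = np.random.default_rng(0)
X = rng.normal(size=(33, 4))
y = 2.0 + X @ np.array([1.0, -0.5, 0.8, 0.3]) + rng.normal(scale=0.2, size=33)

# Hypothetical split: 25 training compounds, 8 external test compounds.
idx = rng.permutation(33)
train, test = idx[:25], idx[25:]

coef = fit_ols(X[train], y[train])
r2_train = r_squared(y[train], predict(coef, X[train]))
r2_test = r_squared(y[test], predict(coef, X[test]))

print(f"r2 (training) = {r2_train:.3f}")
print(f"r2 (test)     = {r2_test:.3f}")
```

Scoring the test set with coefficients fitted only on the training set is what makes the second r² an external check; internal validation schemes such as cross-validation are not shown here.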