Department of Pharmacy, Sultan Qaboos University Hospital, PO Box 38, Al Khod, Muscat 123, Oman.
Eur J Med Chem. 2010 Sep;45(9):4018-25. doi: 10.1016/j.ejmech.2010.05.059. Epub 2010 Jun 4.
Two machine learning methods, artificial neural networks (ANN) and support vector machines (SVM), were used to model the intrinsic solubility of 74 generic drugs. The resulting models were compared with those obtained by multiple linear regression (MLR) analysis. Cluster analysis was used to split the data into a training set and a test set. The appropriate descriptors were selected using a wrapper approach with multiple linear regression as the target learning algorithm. Descriptor selection and model building were performed with 10-fold cross-validation on the training set. The linear model fits the training set (n = 60) with R² = 0.814, while ANN and SVM gave higher values of R² = 0.823 and 0.835, respectively. Although the SVM model fits the training set best, the ANN model was slightly superior to SVM and MLR in predicting the test set. The quantitative structure-property relationship study suggests that the theoretically calculated descriptors log P, first-order valence connectivity index (¹χᵛ), delta chi (Δ²χ) and information content (²IC) are related to the intrinsic solubility of the generic drugs studied.
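The wrapper-style descriptor selection described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not state the search strategy, so a greedy forward search is assumed here, the descriptor matrix is synthetic, and the names `fit_mlr`, `cv_q2` and `forward_select` are hypothetical. The sketch scores each candidate descriptor subset by the 10-fold cross-validated R² of an MLR model and keeps adding descriptors while that score improves. The ANN and SVM models are not covered.

```python
import random

def fit_mlr(X, y):
    """Ordinary least squares via the normal equations; returns [intercept, coefs...]."""
    rows = [[1.0] + list(r) for r in X]
    p = len(rows[0])
    # Augmented matrix [A^T A | A^T y]
    A = [[sum(r[i] * r[j] for r in rows) for j in range(p)] +
         [sum(r[i] * t for r, t in zip(rows, y))] for i in range(p)]
    # Gauss-Jordan elimination with partial pivoting
    for c in range(p):
        piv = max(range(c, p), key=lambda k: abs(A[k][c]))
        A[c], A[piv] = A[piv], A[c]
        for k in range(p):
            if k != c:
                f = A[k][c] / A[c][c]
                A[k] = [a - f * b for a, b in zip(A[k], A[c])]
    return [A[i][p] / A[i][i] for i in range(p)]

def predict(beta, X):
    return [beta[0] + sum(b * v for b, v in zip(beta[1:], r)) for r in X]

def cv_q2(X, y, cols, k=10, seed=0):
    """k-fold cross-validated R^2 of an MLR model using only descriptor columns `cols`."""
    idx = list(range(len(y)))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::k] for i in range(k)]
    press = 0.0  # summed squared error on held-out samples
    for fold in folds:
        held = set(fold)
        train = [i for i in idx if i not in held]
        beta = fit_mlr([[X[i][c] for c in cols] for i in train],
                       [y[i] for i in train])
        preds = predict(beta, [[X[i][c] for c in cols] for i in fold])
        press += sum((y[i] - p) ** 2 for i, p in zip(fold, preds))
    mean_y = sum(y) / len(y)
    ss_tot = sum((t - mean_y) ** 2 for t in y)
    return 1.0 - press / ss_tot

def forward_select(X, y, max_features=4, k=10):
    """Greedy forward wrapper: add the descriptor that most improves the CV score."""
    selected, best = [], float("-inf")
    remaining = list(range(len(X[0])))
    while remaining and len(selected) < max_features:
        scores = {c: cv_q2(X, y, selected + [c], k) for c in remaining}
        c_best = max(scores, key=scores.get)
        if scores[c_best] <= best:
            break  # no candidate improves the cross-validated fit
        best = scores[c_best]
        selected.append(c_best)
        remaining.remove(c_best)
    return selected, best

# Synthetic stand-in for a training set: 60 compounds, 6 candidate descriptors.
# Only columns 0 and 2 carry signal (an assumption made for this illustration).
rng = random.Random(42)
X = [[rng.gauss(0, 1) for _ in range(6)] for _ in range(60)]
y = [-1.5 * r[0] + 0.8 * r[2] + rng.gauss(0, 0.1) for r in X]

selected, q2 = forward_select(X, y)
print("selected descriptor columns:", selected)
print("cross-validated R^2:", round(q2, 3))
```

Scoring subsets on held-out folds, not on the fitted data, is what makes this a wrapper approach: a descriptor is kept only if it improves prediction for compounds the regression has not seen. A separate test set, such as the one the abstract obtains by cluster analysis, would still be needed to judge the final model.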