Nohair Mohamed, Zakarya Driss
Department of Chemistry, Faculty of Sciences and Techniques, UFR of Applied Chemistry, B.P. 146, 20450 Mohammadia, Morocco.
J Mol Model. 2003 Dec;9(6):365-71. doi: 10.1007/s00894-003-0137-x. Epub 2003 Aug 23.
Structure-water solubility modeling of aliphatic alcohols was performed using the multifunctional autocorrelation method. Each molecule is represented by a set of descriptors: some characterize the molecule as a whole, and others capture the structural environment of the O-C edge. Multiple linear regression (MLR) and multilayer feed-forward artificial neural networks (ANN) were used to construct linear and nonlinear QSPR models, respectively. The optimal QSPR model was based on a 4-4-1 neural network architecture. The predictive ability of the ANN and MLR models was assessed by leave-20%-out (L20%O) cross-validation, which showed the neural model to be more reliable than the MLR model. The root mean square errors in the solubility prediction (ln SOL) for the calibration and predictive models were 0.13 and 0.18, respectively. We also tested four activation functions: hyperbolic tangent, sigmoid, or Gaussian for the hidden layer, and linear, sigmoid, hyperbolic tangent, or Gaussian for the output layer. The influence and contribution of each type of descriptor in the model were examined: after omitting a set of descriptors, we calculated the solubility errors and classified them into discrete categories. The standard error and the percentage of predictions falling within the precision interval considered were estimated. The results imply that the solubility of aliphatic alcohols is dominated by the shape and branching of the molecule. The hydrogen-bonding interactions caused by the C-OH group appear to be a less important factor influencing solubility. The model was compared with other models, in particular the one using weighted path numbers, which is considered to be the most accurate QSPR model for predicting the water solubility of aliphatic alcohols.
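The 4-4-1 architecture and the L20%O validation described above can be illustrated with a minimal sketch. It assumes a hyperbolic tangent hidden layer and a linear output (one of the combinations the abstract says was tested; the abstract does not state which was retained), and it interprets leave-20%-out as five disjoint folds of roughly 20% each. The function names, the fold-splitting scheme, and any weights passed in are hypothetical and are not taken from the paper.

```python
import math
import random


def forward(x, w_hidden, b_hidden, w_out, b_out):
    """Forward pass of a 4-4-1 network (assumed: tanh hidden, linear output).

    x        : 4 descriptor values for one molecule
    w_hidden : 4 rows of 4 weights (one row per hidden neuron)
    b_hidden : 4 hidden biases
    w_out    : 4 output weights
    b_out    : output bias
    Returns the predicted ln SOL.
    """
    hidden = [
        math.tanh(sum(w * xi for w, xi in zip(row, x)) + b)
        for row, b in zip(w_hidden, b_hidden)
    ]
    return sum(w * h for w, h in zip(w_out, hidden)) + b_out


def rmse(predicted, observed):
    """Root mean square error between predicted and observed values."""
    n = len(observed)
    return math.sqrt(sum((p - o) ** 2 for p, o in zip(predicted, observed)) / n)


def leave_20_percent_out(n_samples, seed=0):
    """Yield (train_idx, test_idx) for five disjoint folds of about 20% each."""
    idx = list(range(n_samples))
    random.Random(seed).shuffle(idx)
    folds = [idx[k::5] for k in range(5)]
    for k in range(5):
        test = sorted(folds[k])
        train = sorted(i for j, f in enumerate(folds) if j != k for i in f)
        yield train, test
```

In use, a model would be fitted on each `train` subset and `rmse` evaluated on the matching `test` subset; pooling the held-out errors over the five folds gives a cross-validated figure comparable in kind to the predictive RMSE quoted in the abstract.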