Huang Wenkang, Geng Lv, Deng Rong, Lu Shaoyong, Ma Guangli, Yu Jianxiu, Zhang Jian, Liu Wei, Hou Tingjun, Lu Xuefeng
Department of Pathophysiology, Key Laboratory of Cell Differentiation and Apoptosis of Chinese Ministry of Education, Shanghai Jiao Tong University School of Medicine, Shanghai, 200025, China.
Department of Biochemistry and Molecular Cell Biology & Shanghai Key Laboratory of Tumor Microenvironment and Inflammation, Shanghai Jiao Tong University School of Medicine, Shanghai, 200025, China.
Chem Biol Drug Des. 2015 Nov;86(5):990-7. doi: 10.1111/cbdd.12567. Epub 2015 Apr 28.
Human clearance is often predicted before clinical studies from in vivo preclinical data using interspecies allometric scaling. This study aimed to identify the molecular descriptors that matter most when extrapolating animal data to human clearance, and then to build a model that predicts human clearance from animal data combined with the selected descriptors. The descriptors selected by a genetic algorithm (GA) came from five classes: quantum mechanical, shadow indices, E-state keys, molecular properties, and molecular property counts. Although the data set contained many outliers as determined by the conventional Mahmood method, the final support vector machine (SVM) model significantly reduced the variation of most of them. In leave-one-out cross-validation (LOOCV), the final SVM model gave a cross-validated correlation coefficient of 0.783 and a root-mean-squared error (RMSE) of 0.305. The reliability and consistency of the final model were also validated on an external test set. In conclusion, the SVM model based on GA-selected molecular descriptors and animal data achieved better prediction performance than the Mahmood method. This approach can be applied as an improved interspecies allometric scaling method in drug research and development.
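For orientation, the simple allometric power law that underlies interspecies scaling, CL = a·W^b, can be sketched as a log-log least-squares fit followed by extrapolation to human body weight. This is only an illustration of basic allometry, not the Mahmood method and not the authors' GA/SVM model. The function names, the species weights, the 70 kg human weight, and the clearance values (generated from a known power law) are all hypothetical assumptions.

```python
import math


def fit_simple_allometry(weights_kg, clearances):
    """Fit log10(CL) = log10(a) + b * log10(W) by ordinary least squares.

    Returns the coefficient a and the exponent b of CL = a * W**b.
    """
    xs = [math.log10(w) for w in weights_kg]
    ys = [math.log10(c) for c in clearances]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    b = sxy / sxx                      # slope in log-log space = exponent
    a = 10 ** (mean_y - b * mean_x)    # intercept in log-log space = log10(a)
    return a, b


def predict_clearance(a, b, weight_kg):
    """Evaluate the fitted power law at a given body weight."""
    return a * weight_kg ** b


# Hypothetical animal data, generated from CL = 10 * W**0.75 for illustration.
animal_weights = [0.25, 2.5, 10.0]     # assumed rat, rabbit, dog weights in kg
animal_cl = [10.0 * w ** 0.75 for w in animal_weights]

a, b = fit_simple_allometry(animal_weights, animal_cl)
human_cl = predict_clearance(a, b, 70.0)   # assumed 70 kg human
print(f"a = {a:.3f}, b = {b:.3f}, predicted human CL = {human_cl:.1f}")
```

Because the synthetic data follow the power law exactly, the fit recovers the generating exponent and coefficient. Real preclinical data deviate from this line, which is the gap the abstract's descriptor-based model aims to close.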