An Tianbo, Chen Yueren, Chen Yefeng, Ma Leyu, Wang Jingrui, Zhao Jian
College of Network Security, Changchun University, Changchun, Jilin, China.
Institute of Education, Xiamen University, Xiamen, Fujian, China.
Front Genet. 2023 Jan 4;13:1087273. doi: 10.3389/fgene.2022.1087273. eCollection 2022.
Predicting ERα bioactivity and mining the potential relationships among Absorption, Distribution, Metabolism, Excretion, and Toxicity (ADMET) attributes in drug research and development can improve the development efficiency of breast cancer-specific drugs and reduce the misjudgment rate of R&D personnel. A quantitative prediction model of ERα bioactivity and a classification prediction model of ADMET properties were constructed. The ERα bioactivity predictions of XGBoost, LightGBM, Random Forest, and an MLP neural network were compared. Based on mean absolute error (MAE), mean squared error (MSE), and R², two models with high prediction accuracy were selected and fused to obtain the ERα bioactivity prediction model. The data were further subjected to model-based feature selection and to FDR/FPR-based feature selection, and the results were placed in a voting classifier to obtain the ADMET classification prediction model. In this study, 430 molecular descriptors were removed, and the 20 molecular descriptors with the most significant effect on biological activity, obtained by the dual feature screening combined optimization method, were used to establish a molecular descriptor prediction model for ERα biological activity and then to classify and predict the ADMET properties of the drugs. To complete the model-based feature selection, 80 variables were selected by ExtraTreesClassifier and 40 variables were selected by GradientBoostingClassifier. The FDR/FPR-based feature selection method was applied as well, and the three classification models obtained by the two methods were placed into the voting classifier to obtain the final model.
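The FDR/FPR-based screening and the voting step can be illustrated with a minimal standard-library sketch. It is not the authors' implementation. It assumes that each molecular descriptor already has a p-value from a univariate test, that FPR selection means a plain threshold on that p-value, that FDR selection means the Benjamini-Hochberg procedure, and that the voting classifier uses a simple majority (hard) vote. The function names `select_fpr`, `select_fdr`, and `hard_vote`, and the default `alpha=0.05`, are hypothetical.

```python
from collections import Counter


def select_fpr(p_values, alpha=0.05):
    """Keep every feature whose p-value is below alpha (false positive rate control)."""
    return [i for i, p in enumerate(p_values) if p < alpha]


def select_fdr(p_values, alpha=0.05):
    """Benjamini-Hochberg: find the largest rank k with p_(k) <= alpha * k / m
    and keep the k features with the smallest p-values."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    cutoff = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= alpha * rank / m:
            cutoff = rank
    return sorted(order[:cutoff])


def hard_vote(predictions):
    """Majority vote; `predictions` holds one list of class labels per classifier."""
    return [Counter(labels).most_common(1)[0][0] for labels in zip(*predictions)]


# Illustrative p-values for five descriptors (made-up numbers, not study data).
p_values = [0.001, 0.045, 0.035, 0.2, 0.008]
fpr_selected = select_fpr(p_values)
fdr_selected = select_fdr(p_values)

# Labels predicted by three hypothetical classifiers for three compounds.
votes = hard_vote([[1, 0, 0], [1, 1, 0], [0, 1, 0]])
```

With an odd number of binary classifiers, as in the three-model setup described above, a majority vote cannot tie. The FDR rule is stricter than the FPR rule on the same p-values, so it keeps fewer descriptors.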
The experimental results showed that the model's evaluation metrics and ROC curves were excellent and that it could accurately predict ERα bioactivity and ADMET properties. The model constructed in this study is accurate, converges quickly, and is robust. It achieves very high accuracy for ADMET and ERα classification prediction, has bright prospects in the biopharmaceutical field, and is an important method for energy conservation and yield increase in the future.
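The regression metrics named in the abstract (MAE, MSE, and R²) and the idea of fusing two regressors can be sketched as follows. The abstract does not say how the two selected models were fused, so the weighted average in `fuse` is an assumption made for illustration only, and all function names and numbers here are hypothetical.

```python
def mae(y_true, y_pred):
    """Mean absolute error."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def mse(y_true, y_pred):
    """Mean squared error."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def r2(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1 - ss_res / ss_tot


def fuse(pred_a, pred_b, weight_a=0.5):
    """Assumed fusion rule: weighted average of two models' predictions."""
    return [weight_a * a + (1 - weight_a) * b for a, b in zip(pred_a, pred_b)]


# Made-up bioactivity values and predictions from two hypothetical models.
y_true = [1.0, 2.0, 3.0, 4.0]
pred_a = [1.5, 2.5, 3.5, 4.5]   # consistently too high
pred_b = [0.5, 1.5, 2.5, 3.5]   # consistently too low
fused = fuse(pred_a, pred_b)

scores = {
    "model_a": (mae(y_true, pred_a), mse(y_true, pred_a), r2(y_true, pred_a)),
    "fused": (mae(y_true, fused), mse(y_true, fused), r2(y_true, fused)),
}
```

In this toy case the two models err in opposite directions, so averaging cancels the errors. Real models rarely complement each other this cleanly, and any fusion weight would need to be chosen on validation data.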