A Machine Learning-Based QSAR Model for Benzimidazole Derivatives as Corrosion Inhibitors by Incorporating Comprehensive Feature Selection.

Affiliations

Research Institute of Natural Gas Technology, Petro China Southwest Oil and Gas Field Company, Chengdu, 610213, China.

College of Chemistry, Sichuan University, Chengdu, Sichuan, 610064, People's Republic of China.

Publication information

Interdiscip Sci. 2019 Dec;11(4):738-747. doi: 10.1007/s12539-019-00346-7. Epub 2019 Sep 4.

DOI:10.1007/s12539-019-00346-7
PMID:31486019
Abstract

BACKGROUND

Computational prediction of the inhibition efficiency (IE) of inhibitor molecules is a valuable supplementary tool for designing novel molecules that efficiently inhibit corrosion of metallic surfaces.

PURPOSE

Here we develop a new machine learning-based predictor of the inhibition efficiency (IE) of benzimidazole derivatives.

METHODS

First, the inhibitor molecules were given a comprehensive numerical representation covering energy, electronic, topological, physicochemical and spatial properties derived from their 3-D structures, which yielded 150 valid structural descriptors. These descriptors were then investigated thoroughly. A multicollinearity-based clustering analysis was performed to remove linearly correlated feature variables, producing 47 feature clusters. Gini importance from a random forest (RF) was then used to measure the contribution of the descriptors within each cluster, and the descriptor with the highest Gini importance score in each cluster was selected, giving 47 descriptors that are not linearly correlated with one another. Further, considering the limited number of available inhibitors, different feature subsets were constructed according to the Gini importance ranking of the 47 descriptors.
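The two-stage selection described in the methods can be sketched roughly as follows. This is only an illustration under stated assumptions, not the authors' implementation: the abstract does not say how the multicollinearity clustering was done, so the sketch groups descriptors whose absolute Pearson correlation exceeds a hypothetical threshold of 0.9 (connected components), and it uses scikit-learn's impurity-based `feature_importances_` as the Gini importance. The function names, the threshold, the forest size and the synthetic data are all assumptions.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor


def correlation_clusters(X, threshold=0.9):
    """Group descriptor columns linked by |Pearson r| > threshold.

    Clusters are the connected components of the correlation graph.
    The threshold value is an assumption, not taken from the paper.
    """
    n_features = X.shape[1]
    corr = np.abs(np.corrcoef(X, rowvar=False))
    parent = list(range(n_features))

    def find(i):
        # Union-find root lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i in range(n_features):
        for j in range(i + 1, n_features):
            if corr[i, j] > threshold:
                parent[find(i)] = find(j)

    clusters = {}
    for i in range(n_features):
        clusters.setdefault(find(i), []).append(i)
    return list(clusters.values())


def select_representatives(X, y, threshold=0.9, seed=0):
    """Keep the most important descriptor of each cluster.

    Returns column indices ranked by decreasing random-forest importance.
    """
    forest = RandomForestRegressor(n_estimators=200, random_state=seed)
    importance = forest.fit(X, y).feature_importances_
    representatives = [
        max(cluster, key=lambda j: importance[j])
        for cluster in correlation_clusters(X, threshold)
    ]
    return sorted(representatives, key=lambda j: importance[j], reverse=True)


if __name__ == "__main__":
    # Synthetic stand-in for a descriptor matrix (not the paper's data).
    rng = np.random.default_rng(0)
    base = rng.normal(size=(40, 3))
    X = np.column_stack([base[:, 0], 2 * base[:, 0], base[:, 1], base[:, 2]])
    y = 3 * base[:, 1] + 0.1 * rng.normal(size=40)
    print(select_representatives(X, y))
```

Taking the first k entries of the returned ranking gives the kind of nested feature subsets that the methods describe.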

RESULTS

Finally, support vector machine (SVM) models built on the different feature subsets were evaluated by leave-one-out cross-validation. By comparison, the optimal SVM model used the top 11 descriptors with a polynomial (Poly) kernel. This model achieves a correlation coefficient (R) of 0.9589 and a root-mean-square error (RMSE) of 4.45, which indicates that the proposed method gives the best performance on the current data.
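The evaluation loop described in the results can be sketched roughly as follows: a polynomial-kernel support vector regressor is scored by leave-one-out cross-validation, and R and RMSE are computed from the pooled held-out predictions. This is a minimal sketch under assumptions. The abstract gives no hyperparameters, so `degree`, `C`, `epsilon`, `coef0` and the standardization step are placeholders, the helper names are hypothetical, and choosing the subset by lowest RMSE is one plausible reading of "through comparisons".

```python
import numpy as np
from sklearn.model_selection import LeaveOneOut, cross_val_predict
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVR


def loocv_scores(X, y, degree=2, C=10.0, epsilon=0.1):
    """Return (R, RMSE) from leave-one-out predictions of a poly-kernel SVR.

    All hyperparameter values here are assumptions for illustration.
    """
    model = make_pipeline(
        StandardScaler(),
        SVR(kernel="poly", degree=degree, coef0=1.0, C=C, epsilon=epsilon),
    )
    # Each sample is predicted by a model trained on all the other samples.
    y_pred = cross_val_predict(model, X, y, cv=LeaveOneOut())
    r = np.corrcoef(y, y_pred)[0, 1]
    rmse = np.sqrt(np.mean((y - y_pred) ** 2))
    return r, rmse


def best_top_k(X_ranked, y, k_values):
    """Score the first k columns of X_ranked for each k in k_values.

    X_ranked is assumed to have its columns ordered by decreasing
    importance; the subset with the lowest RMSE is reported as best.
    """
    results = {k: loocv_scores(X_ranked[:, :k], y) for k in k_values}
    best_k = min(results, key=lambda k: results[k][1])
    return best_k, results


if __name__ == "__main__":
    # Synthetic stand-in data (not the paper's inhibitors).
    rng = np.random.default_rng(1)
    X = rng.normal(size=(30, 4))
    y = 2.0 * X[:, 0] + X[:, 1] + 0.1 * rng.normal(size=30)
    best_k, results = best_top_k(X, y, k_values=[1, 2, 4])
    print(best_k, results[best_k])
```

Placing the scaler inside the pipeline means it is refit on each training fold, so the held-out sample never influences the scaling.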

CONCLUSION

Based on our model, six new benzimidazole molecules were designed; the IE values predicted by the model indicate that two of them have high potential as corrosion inhibitors.
