Improving virtual screening predictive accuracy of Human kallikrein 5 inhibitors using machine learning models.

Authors

Fang Xingang, Bagui Sikha, Bagui Subhash

Affiliations

Department of Computer Science, University of West Florida, Pensacola, FL 32514, United States.

Department of Mathematics and Statistics, University of West Florida, Pensacola, FL 32514, United States.

Publication information

Comput Biol Chem. 2017 Aug;69:110-119. doi: 10.1016/j.compbiolchem.2017.05.007. Epub 2017 May 29.

DOI: 10.1016/j.compbiolchem.2017.05.007
PMID: 28601761

Abstract

The readily available high-throughput screening (HTS) data in the PubChem database provide an opportunity to mine small molecules across a variety of biological systems using machine learning techniques. Thousands of molecular descriptors have been developed to encode chemical information that characterizes molecules, so descriptor selection is an essential step in building an optimal quantitative structure-activity relationship (QSAR) model. Developing a systematic descriptor selection strategy requires understanding the relationship among: (i) the descriptor selection; (ii) the choice of machine learning model; and (iii) the characteristics of the target biomolecule. In this work, we used the Signature descriptor to generate a dataset from the Human kallikrein 5 (hK 5) inhibition confirmatory assay data and compared multiple classification models, including logistic regression, support vector machine, random forest, and k-nearest neighbor. Under optimal conditions, the logistic regression model achieved very high overall accuracy (98%) and precision (90%), with good sensitivity (65%), in the cross-validation test. When tested on the primary HTS data, which contain more than 200K molecular structures, the logistic regression model was able to eliminate more than 99.9% of the inactive structures. As part of our exploration of the descriptor-model-target relationship, the excellent predictive performance of the Signature descriptor combined with logistic regression on the hK 5 assay data suggests a feasible descriptor/model selection strategy for similar targets.
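
The abstract reports accuracy, precision, and sensitivity side by side, and on a heavily imbalanced screening set these three numbers can diverge sharply. The following is a minimal Python sketch of how the metrics relate to a confusion matrix. The class name, helper functions, and the counts in `hypothetical` are assumptions chosen for illustration only; they are not the paper's data or code.

```python
from dataclasses import dataclass
from typing import Sequence


def _ratio(numerator: int, denominator: int) -> float:
    """Return numerator / denominator, or 0.0 when the denominator is zero."""
    return numerator / denominator if denominator else 0.0


@dataclass(frozen=True)
class ConfusionMatrix:
    """Counts for a binary active (1) / inactive (0) classifier."""
    tp: int  # actives predicted active
    fp: int  # inactives predicted active
    tn: int  # inactives predicted inactive
    fn: int  # actives predicted inactive

    @property
    def accuracy(self) -> float:
        # Fraction of all molecules classified correctly.
        return _ratio(self.tp + self.tn, self.tp + self.fp + self.tn + self.fn)

    @property
    def precision(self) -> float:
        # Fraction of predicted actives that are truly active.
        return _ratio(self.tp, self.tp + self.fp)

    @property
    def sensitivity(self) -> float:
        # Fraction of true actives that the model recovers (recall).
        return _ratio(self.tp, self.tp + self.fn)

    @property
    def specificity(self) -> float:
        # Fraction of true inactives that the model rejects.
        return _ratio(self.tn, self.tn + self.fp)


def confusion_from_labels(y_true: Sequence[int], y_pred: Sequence[int]) -> ConfusionMatrix:
    """Tally a confusion matrix from parallel 0/1 label sequences."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return ConfusionMatrix(tp=tp, fp=fp, tn=tn, fn=fn)


# Hypothetical counts (NOT from the paper): 1000 molecules, 40 of them active.
hypothetical = ConfusionMatrix(tp=26, fp=3, tn=957, fn=14)

print(f"accuracy    = {hypothetical.accuracy:.1%}")
print(f"precision   = {hypothetical.precision:.1%}")
print(f"sensitivity = {hypothetical.sensitivity:.1%}")
print(f"specificity = {hypothetical.specificity:.1%}")
```

With these made-up counts the sketch yields roughly 98% accuracy, 90% precision, and 65% sensitivity. Inactives dominate the set, so accuracy stays high even though about a third of the actives are missed, which is why precision and sensitivity are reported alongside it.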

Similar articles

1. Improving virtual screening predictive accuracy of Human kallikrein 5 inhibitors using machine learning models.
   Comput Biol Chem. 2017 Aug;69:110-119. doi: 10.1016/j.compbiolchem.2017.05.007. Epub 2017 May 29.
2. QSAR modeling of imbalanced high-throughput screening data in PubChem.
   J Chem Inf Model. 2014 Mar 24;54(3):705-12. doi: 10.1021/ci400737s. Epub 2014 Feb 28.
3. Machine Learning Approaches Toward Building Predictive Models for Small Molecule Modulators of miRNA and Its Utility in Virtual Screening of Molecular Databases.
   Methods Mol Biol. 2017;1517:155-168. doi: 10.1007/978-1-4939-6563-2_11.
4. Automated Inference of Chemical Discriminants of Biological Activity.
   Methods Mol Biol. 2018;1762:307-338. doi: 10.1007/978-1-4939-7756-7_16.
5. Data mining PubChem using a support vector machine with the Signature molecular descriptor: classification of factor XIa inhibitors.
   J Mol Graph Model. 2008 Nov;27(4):466-75. doi: 10.1016/j.jmgm.2008.08.004. Epub 2008 Aug 27.
6. Binary classification of a large collection of environmental chemicals from estrogen receptor assays by quantitative structure-activity relationship and machine learning methods.
   J Chem Inf Model. 2013 Dec 23;53(12):3244-61. doi: 10.1021/ci400527b. Epub 2013 Dec 11.
7. QSAR classification-based virtual screening followed by molecular docking studies for identification of potential inhibitors of 5-lipoxygenase.
   Comput Biol Chem. 2018 Dec;77:154-166. doi: 10.1016/j.compbiolchem.2018.10.002. Epub 2018 Oct 5.
8. Quantitative structure-activity relationship analysis and virtual screening studies for identifying HDAC2 inhibitors from known HDAC bioactive chemical libraries.
   SAR QSAR Environ Res. 2017 Mar;28(3):199-220. doi: 10.1080/1062936X.2017.1294198. Epub 2017 Feb 28.
9. Multiple machine learning based descriptive and predictive workflow for the identification of potential PTP1B inhibitors.
   J Mol Graph Model. 2017 Jan;71:242-256. doi: 10.1016/j.jmgm.2016.10.020. Epub 2016 Nov 3.
10. Development of CYP3A4 inhibition models: comparisons of machine-learning techniques and molecular descriptors.
    J Biomol Screen. 2005 Apr;10(3):197-205. doi: 10.1177/1087057104274091.

Cited by

1. Artificial intelligence and machine-learning approaches in structure and ligand-based discovery of drugs affecting central nervous system.
   Mol Divers. 2023 Apr;27(2):959-985. doi: 10.1007/s11030-022-10489-3. Epub 2022 Jul 11.
2. Computational Screening of Potential Inhibitors of for Pyrite Scale Prevention in Oil and Gas Wells.
   ACS Omega. 2021 Apr 13;6(16):10607-10617. doi: 10.1021/acsomega.0c06078. eCollection 2021 Apr 27.
3. Predictive Models to Identify Small Molecule Activators and Inhibitors of Opioid Receptors.
   J Chem Inf Model. 2021 Jun 28;61(6):2675-2685. doi: 10.1021/acs.jcim.1c00439. Epub 2021 May 28.
4. Artificial intelligence and machine learning-aided drug discovery in central nervous system diseases: State-of-the-arts and future directions.
   Med Res Rev. 2021 May;41(3):1427-1473. doi: 10.1002/med.21764. Epub 2020 Dec 9.
5. Optimization of a Deep-Learning Method Based on the Classification of Images Generated by Parameterized Deep Snap a Novel Molecular-Image-Input Technique for Quantitative Structure-Activity Relationship (QSAR) Analysis.
   Front Bioeng Biotechnol. 2019 Mar 28;7:65. doi: 10.3389/fbioe.2019.00065. eCollection 2019.