Assessing the information content of structural and protein-ligand interaction representations for the classification of kinase inhibitor binding modes via machine learning and active learning.

Author information

Rodríguez-Pérez Raquel, Miljković Filip, Bajorath Jürgen

Affiliation

Department of Life Science Informatics, B-IT, LIMES Program Unit Chemical Biology and Medicinal Chemistry, Rheinische Friedrich-Wilhelms-Universität, Endenicher Allee 19c, 53115, Bonn, Germany.

Publication information

J Cheminform. 2020 May 24;12(1):36. doi: 10.1186/s13321-020-00434-7.

DOI:10.1186/s13321-020-00434-7

PMID:33431025

Original article: https://pmc.ncbi.nlm.nih.gov/articles/PMC7245824/

Abstract

For kinase inhibitors, X-ray crystallography has revealed different types of binding modes. Currently, more than 2000 kinase inhibitors with known binding modes are available, which makes it possible to derive and test machine learning models for the prediction of inhibitors with different binding modes. We have addressed this prediction task to evaluate and compare the information content of distinct molecular representations including protein-ligand interaction fingerprints (IFPs) and compound structure-based structural fingerprints (i.e., atom environment/fragment fingerprints). IFPs were designed to capture binding mode-specific interaction patterns at different resolution levels. Accurate predictions of kinase inhibitor binding modes were achieved with random forests using both representations. The performance of IFPs was consistently superior to atom environment fingerprints, albeit only by less than 10%. An active learning strategy applying information entropy-based selection of training instances was applied as a diagnostic approach to assess the relative information content of distinct representations. IFPs were found to capture more binding mode-relevant information than atom environment fingerprints, leading to highly predictive models even when training instances were randomly selected. By contrast, for atom environment fingerprints, the derivation of accurate models via active learning depended on entropy-based selection of informative training compounds. Notably, higher information content of IFPs confirmed by active learning only resulted in small improvements in global prediction accuracy compared to models derived using atom environment fingerprints. For practical applications, prediction of binding modes of new kinase inhibitors on the basis of chemical structure is highly attractive.
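The entropy-based selection step mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes that a classifier (for example, a random forest) has already produced class probabilities for each compound in an unlabeled pool, and it simply ranks pool compounds by the Shannon entropy of those probabilities. The function names, the four-class setup, and the probability values are hypothetical and chosen only for illustration.

```python
import math
from typing import List, Sequence


def shannon_entropy(probs: Sequence[float]) -> float:
    """Shannon entropy (in bits) of one predicted class distribution."""
    # Terms with zero probability contribute nothing and are skipped.
    return -sum(p * math.log2(p) for p in probs if p > 0.0)


def select_most_uncertain(pool_probs: Sequence[Sequence[float]],
                          n_select: int) -> List[int]:
    """Return indices of the n_select pool instances with the highest entropy."""
    ranked = sorted(
        range(len(pool_probs)),
        key=lambda i: shannon_entropy(pool_probs[i]),
        reverse=True,
    )
    return ranked[:n_select]


# Hypothetical predicted probabilities for four pool compounds over four
# assumed binding-mode classes (illustrative values, not data from the paper).
pool_probs = [
    [0.97, 0.01, 0.01, 0.01],  # confident prediction, low entropy
    [0.25, 0.25, 0.25, 0.25],  # maximally uncertain, entropy of 2 bits
    [0.60, 0.30, 0.05, 0.05],  # moderately uncertain
    [0.50, 0.50, 0.00, 0.00],  # split between two classes, entropy of 1 bit
]

picked = select_most_uncertain(pool_probs, n_select=2)
print(picked)
```

In an active learning loop, the selected compounds would be labeled, added to the training set, and the model retrained before the next round. Replacing `select_most_uncertain` with a random draw from the pool gives the random-selection baseline that the abstract contrasts with entropy-based selection.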

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/7508/7245824/ce701e4966f8/13321_2020_434_Fig1_HTML.jpg
