Exploring QSAR models for activity-cliff prediction.

Authors

Dablander Markus, Hanser Thierry, Lambiotte Renaud, Morris Garrett M

Affiliations

Mathematical Institute, University of Oxford, Andrew Wiles Building, Radcliffe Observatory Quarter (550), Woodstock Road, Oxford, OX2 6GG, UK.

Lhasa Limited, Granary Wharf House, 2 Canal Wharf, Leeds, LS11 5PS, UK.

Publication information

J Cheminform. 2023 Apr 17;15(1):47. doi: 10.1186/s13321-023-00708-w.

DOI:10.1186/s13321-023-00708-w

PMID:37069675

Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC10107580/

Abstract

INTRODUCTION AND METHODOLOGY

Pairs of similar compounds that only differ by a small structural modification but exhibit a large difference in their binding affinity for a given target are known as activity cliffs (ACs). It has been hypothesised that QSAR models struggle to predict ACs and that ACs thus form a major source of prediction error. However, the AC-prediction power of modern QSAR methods and its quantitative relationship to general QSAR-prediction performance is still underexplored. We systematically construct nine distinct QSAR models by combining three molecular representation methods (extended-connectivity fingerprints, physicochemical-descriptor vectors and graph isomorphism networks) with three regression techniques (random forests, k-nearest neighbours and multilayer perceptrons); we then use each resulting model to classify pairs of similar compounds as ACs or non-ACs and to predict the activities of individual molecules in three case studies: dopamine receptor D2, factor Xa, and SARS-CoV-2 main protease.
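The experimental design described above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' code: the short labels (`ECFP`, `PDV`, `GIN`, `RF`, `kNN`, `MLP`), the function names, and the `threshold` value are assumptions introduced here, since the abstract does not state the activity-difference cut-off used to define an AC.

```python
from itertools import product

# Representation and regressor families named in the abstract.
# The short labels are illustrative assumptions.
REPRESENTATIONS = ["ECFP", "PDV", "GIN"]
REGRESSORS = ["RF", "kNN", "MLP"]


def model_grid():
    """Enumerate the 3 x 3 = 9 representation/regressor combinations."""
    return [f"{rep}+{reg}" for rep, reg in product(REPRESENTATIONS, REGRESSORS)]


def classify_pair(activity_a, activity_b, threshold=1.0):
    """Label a pair of similar compounds as an activity cliff (AC).

    The pair counts as an AC when the absolute activity difference
    exceeds `threshold`. The default of 1.0 is a hypothetical cut-off,
    not a value taken from the abstract.
    """
    return abs(activity_a - activity_b) > threshold
```

Feeding the two predicted activities from any of the nine models into `classify_pair` turns a regression model into an AC classifier, which is the reuse of QSAR models that the abstract describes.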

RESULTS AND CONCLUSIONS

Our results provide strong support for the hypothesis that indeed QSAR models frequently fail to predict ACs. We observe low AC-sensitivity amongst the evaluated models when the activities of both compounds are unknown, but a substantial increase in AC-sensitivity when the actual activity of one of the compounds is given. Graph isomorphism features are found to be competitive with or superior to classical molecular representations for AC-classification and can thus be employed as baseline AC-prediction models or simple compound-optimisation tools. For general QSAR-prediction, however, extended-connectivity fingerprints still consistently deliver the best performance amongst the tested input representations. A potential future pathway to improve QSAR-modelling performance might be the development of techniques to increase AC-sensitivity.
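The two evaluation settings mentioned above (both activities unknown versus one activity given) can be made concrete with a small sketch. It assumes that AC-sensitivity means the fraction of true ACs that the model also labels as ACs; the function name, the `threshold` default, the toy pairs, and the constant predictor are all hypothetical and are not data or results from the paper.

```python
def ac_sensitivity(pairs, predict, threshold=1.0, one_known=False):
    """Fraction of true ACs that are also predicted to be ACs.

    pairs:     list of ((mol_a, act_a), (mol_b, act_b)) with measured activities
    predict:   callable mapping a molecule to a predicted activity
    one_known: if True, use the measured activity of the first compound
               and predict only the second one
    """
    true_acs = 0
    hits = 0
    for (mol_a, act_a), (mol_b, act_b) in pairs:
        if abs(act_a - act_b) <= threshold:
            continue  # not a true AC, so it does not affect sensitivity
        true_acs += 1
        est_a = act_a if one_known else predict(mol_a)
        est_b = predict(mol_b)
        if abs(est_a - est_b) > threshold:
            hits += 1
    return hits / true_acs if true_acs else float("nan")


# Toy data: one true AC and one non-AC pair (invented values).
toy_pairs = [
    (("a", 8.0), ("b", 6.2)),
    (("c", 6.0), ("d", 6.1)),
]


def smooth_predictor(mol):
    """Toy model that assigns similar compounds the same activity."""
    return 6.0
```

With this toy predictor, `ac_sensitivity(toy_pairs, smooth_predictor)` is 0.0, while `ac_sensitivity(toy_pairs, smooth_predictor, one_known=True)` is 1.0. A model that maps similar structures to similar predictions cannot produce a large predicted gap on its own, but it can once one end of the pair is anchored to a measured value.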

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/6b1d/10108532/46819957c591/13321_2023_708_Fig1_HTML.jpg

Similar articles

Prediction of Activity Cliffs Using Condensed Graphs of Reaction Representations, Descriptor Recombination, Support Vector Machine Classification, and Support Vector Regression.

J Chem Inf Model. 2016 Sep 26;56(9):1631-40. doi: 10.1021/acs.jcim.6b00359. Epub 2016 Aug 26.

Large-scale prediction of activity cliffs using machine and deep learning methods of increasing complexity.

J Cheminform. 2023 Jan 7;15(1):4. doi: 10.1186/s13321-022-00676-7.

OLB-AC: toward optimizing ligand bioactivities through deep graph learning and activity cliffs.

Bioinformatics. 2024 Jun 3;40(6). doi: 10.1093/bioinformatics/btae365.

Prediction of activity cliffs on the basis of images using convolutional neural networks.

J Comput Aided Mol Des. 2021 Dec;35(12):1157-1164. doi: 10.1007/s10822-021-00380-y. Epub 2021 Mar 19.

Ligand-based Activity Cliff Prediction Models with Applicability Domain.

Mol Inform. 2020 Dec;39(12):e2000103. doi: 10.1002/minf.202000103. Epub 2020 Sep 11.

Performance of Deep and Shallow Neural Networks, the Universal Approximation Theorem, Activity Cliffs, and QSAR.

Mol Inform. 2017 Jan;36(1-2). doi: 10.1002/minf.201600118. Epub 2016 Oct 26.

Targeting HIV/HCV Coinfection Using a Machine Learning-Based Multiple Quantitative Structure-Activity Relationships (Multiple QSAR) Method.

Int J Mol Sci. 2019 Jul 22;20(14):3572. doi: 10.3390/ijms20143572.

QSAR-derived affinity fingerprints (part 2): modeling performance for potency prediction.

J Cheminform. 2020 Jun 5;12(1):41. doi: 10.1186/s13321-020-00444-5.

Hyperbolic relational graph convolution networks plus: a simple but highly efficient QSAR-modeling method.

Brief Bioinform. 2021 Sep 2;22(5). doi: 10.1093/bib/bbab112.

Cited by

ACtriplet: An improved deep learning model for activity cliffs prediction by integrating triplet loss and pre-training.

J Pharm Anal. 2025 Aug;15(8):101317. doi: 10.1016/j.jpha.2025.101317. Epub 2025 Apr 21.

The topology of molecular representations and its influence on machine learning performance.

J Cheminform. 2025 Jul 21;17(1):109. doi: 10.1186/s13321-025-01045-w.

ACES-GNN: can graph neural network learn to explain activity cliffs?

Digit Discov. 2025 Jun 30. doi: 10.1039/d5dd00012b.

GraphGIM: rethinking molecular graph contrastive learning via geometry image modeling.

BMC Biol. 2025 Jul 1;23(1):189. doi: 10.1186/s12915-025-02249-0.

The Quasi-Bound State as a Predictor of Relative Binding Free Energy.

J Chem Inf Model. 2025 Jun 9;65(11):5544-5552. doi: 10.1021/acs.jcim.5c00289. Epub 2025 May 20.

Machine Learning for Toxicity Prediction Using Chemical Structures: Pillars for Success in the Real World.

Chem Res Toxicol. 2025 May 19;38(5):759-807. doi: 10.1021/acs.chemrestox.5c00033. Epub 2025 May 2.

Activity cliff-aware reinforcement learning for de novo drug design.

J Cheminform. 2025 Apr 21;17(1):54. doi: 10.1186/s13321-025-01006-3.

Activity Cliff-Informed Contrastive Learning for Molecular Property Prediction.

Res Sq. 2024 Dec 4:rs.3.rs-2988283. doi: 10.21203/rs.3.rs-2988283/v2.

Sort & Slice: a simple and superior alternative to hash-based folding for extended-connectivity fingerprints.

J Cheminform. 2024 Dec 3;16(1):135. doi: 10.1186/s13321-024-00932-y.

Extended Activity Cliffs-Driven Approaches on Data Splitting for the Study of Bioactivity Machine Learning Predictions.

Mol Inform. 2025 Jan;44(1):e202400054. doi: 10.1002/minf.202400054. Epub 2024 Nov 18.
