Quantum-mechanical property prediction of solvated drug molecules: what have we learned from a decade of SAMPL blind prediction challenges?

Affiliations

Physikalische Chemie III, Technische Universität Dortmund, Otto-Hahn-Str. 4a, 44227, Dortmund, Germany.

R&D Integrated Drug Discovery, Sanofi-Aventis Deutschland GmbH, 65926, Frankfurt am Main, Germany.

Publication information

J Comput Aided Mol Des. 2021 Apr;35(4):453-472. doi: 10.1007/s10822-020-00347-5. Epub 2020 Oct 20.

DOI:10.1007/s10822-020-00347-5

PMID:33079358

Original article: https://pmc.ncbi.nlm.nih.gov/articles/PMC8018924/

Abstract

Joint academic-industrial projects supporting drug discovery are frequently pursued to deploy and benchmark cutting-edge methodical developments from academia in a real-world industrial environment at different scales. The dimensionality of tasks ranges from small molecule physicochemical property assessment over protein-ligand interaction up to statistical analyses of biological data. This way, method development and usability both benefit from insights gained at both ends, when predictiveness and readiness of novel approaches are confirmed, but the pharmaceutical drug makers get early access to novel tools for the quality of drug products and benefit of patients. Quantum-mechanical and simulation methods particularly fall into this group of methods, as they require skills and expense in their development but also significant resources in their application, thus are comparatively slowly dripping into the realm of industrial use. Nevertheless, these physics-based methods are becoming more and more useful. Starting with a general overview of these and in particular quantum-mechanical methods for drug discovery we review a decade-long and ongoing collaboration between Sanofi and the Kast group focused on the application of the embedded cluster reference interaction site model (EC-RISM), a solvation model for quantum chemistry, to study small molecule chemistry in the context of joint participation in several SAMPL (Statistical Assessment of Modeling of Proteins and Ligands) blind prediction challenges. Starting with early application to tautomer equilibria in water (SAMPL2) the methodology was further developed to allow for challenge contributions related to predictions of distribution coefficients (SAMPL5) and acidity constants (SAMPL6) over the years. Particular emphasis is put on a frequently overlooked aspect of measuring the quality of models, namely the retrospective analysis of earlier datasets and predictions in light of more recent and advanced developments. 
We therefore demonstrate the performance of the current methodical state of the art as developed and optimized for the SAMPL6 pK and octanol-water log P challenges when re-applied to the earlier SAMPL5 cyclohexane-water log D and SAMPL2 tautomer equilibria datasets. Systematic improvement is not consistently found throughout despite the similarity of the problem class, i.e. protonation reactions and phase distribution. Hence, it is possible to learn about hidden bias in model assessment, as results derived from more elaborate methods do not necessarily improve quantitative agreement. This indicates the role of chance or coincidence for model development on the one hand which allows for the identification of systematic error and opportunities toward improvement and reveals possible sources of experimental uncertainty on the other. These insights are particularly useful for further academia-industry collaborations, as both partners are then enabled to optimize both the computational and experimental settings for data generation.
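The abstract names three quantities (tautomer equilibria, acidity constants, and distribution coefficients) without giving formulas. As a rough illustration of how they relate to computed free energies, the sketch below uses standard textbook relations only. It is not the EC-RISM workflow. The function names are hypothetical, and the log D expressions assume a single ionizable group with only the neutral species partitioning into the organic phase.

```python
import math

# Gas constant in kcal/(mol*K); unit choice is an assumption for this sketch.
R_KCAL = 1.987204e-3


def tautomer_ratio(delta_g_kcal, temperature=298.15):
    """Equilibrium ratio [B]/[A] from the free energy difference G(B) - G(A)."""
    return math.exp(-delta_g_kcal / (R_KCAL * temperature))


def pka_from_delta_g(delta_g_kcal, temperature=298.15):
    """pKa from the standard free energy of deprotonation: dG / (RT ln 10)."""
    return delta_g_kcal / (R_KCAL * temperature * math.log(10.0))


def log_d_monoprotic_acid(log_p, pka, ph):
    """log D of a monoprotic acid, assuming only the neutral form partitions."""
    return log_p - math.log10(1.0 + 10.0 ** (ph - pka))


def log_d_monoprotic_base(log_p, pka, ph):
    """log D of a monoprotic base (pka refers to the conjugate acid)."""
    return log_p - math.log10(1.0 + 10.0 ** (pka - ph))


if __name__ == "__main__":
    # Illustrative inputs only; these are not values from the paper.
    print(tautomer_ratio(0.0))
    print(pka_from_delta_g(6.8))
    print(log_d_monoprotic_acid(log_p=2.0, pka=5.0, ph=7.4))
```

Under these assumptions, an error in a computed pK carries over directly into log D whenever the pH is on the ionized side of the pK, which is one reason protonation and phase-distribution predictions are closely linked.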

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/0859/8018924/24aa83bab7ec/10822_2020_347_Fig1_HTML.jpg
