Yin Jian, Henriksen Niel M, Slochower David R, Shirts Michael R, Chiu Michael W, Mobley David L, Gilson Michael K
Skaggs School of Pharmacy and Pharmaceutical Sciences, University of California San Diego, La Jolla, CA, 92093, USA.
Department of Chemical and Biological Engineering, University of Colorado Boulder, Boulder, CO, 80309, USA.
J Comput Aided Mol Des. 2017 Jan;31(1):1-19. doi: 10.1007/s10822-016-9974-4. Epub 2016 Sep 22.
The ability to computationally predict protein-small molecule binding affinities with high accuracy would accelerate drug discovery and reduce its cost by eliminating rounds of trial-and-error synthesis and experimental evaluation of candidate ligands. As academic and industrial groups work toward this capability, there is an ongoing need for datasets that can be used to rigorously test new computational methods. Although protein-ligand data are clearly important for this purpose, their size and complexity make it difficult to obtain well-converged results and to troubleshoot computational methods. Host-guest systems offer a valuable alternative class of test cases, as they exemplify noncovalent molecular recognition but are far smaller and simpler. As a consequence, host-guest systems have been part of the prior two rounds of SAMPL prediction exercises, and they also figure in the present SAMPL5 round. In addition to being blinded, and thus avoiding biases that may arise in retrospective studies, the SAMPL challenges have the merit of focusing multiple researchers on a common set of molecular systems, so that methods may be compared and ideas exchanged. The present paper provides an overview of the host-guest component of SAMPL5, which centers on three different hosts (two octa-acids and a glycoluril-based molecular clip) and two different sets of guest molecules, all in aqueous solution. A range of methods were applied, including electronic structure calculations with implicit solvent models; methods that combine empirical force fields with implicit solvent models; and explicit solvent free energy simulations. The most reliable methods tend to fall in the last of these classes, consistent with results in prior SAMPL rounds, but the level of accuracy is still below that sought for reliable computer-aided drug design. Advances in force field accuracy, modeling of protonation equilibria, electronic structure methods, and solvent models hold promise for future improvements.