Pason Lukas P, Sotriffer Christoph A
Institute of Pharmacy and Food Chemistry, University of Würzburg, Am Hubland, D-97074, Würzburg, Germany.
Mol Inform. 2016 Dec;35(11-12):541-548. doi: 10.1002/minf.201600048. Epub 2016 Jul 8.
The ability to rapidly assess the quality of a protein-ligand complex in terms of its affinity is of fundamental importance for various methods of computer-aided drug design. While simple filtering or matching criteria may be sufficient in fast docking methods or at early stages of virtual screening, estimates of the actual free energy of binding are needed whenever refined docking solutions, ligand rankings or support for the optimization of hit compounds are required. If rigorous free energy calculations based on molecular simulations are impractical, such affinity estimates are provided by scoring functions. The class of empirical scoring functions aims to provide them via a regression-based approach. Using experimental structures and affinity data of protein-ligand complexes and descriptors suitable to capture the essential features of the interaction, these functions are trained with classical linear regression techniques or machine-learning methods. The latter have led to considerable improvements in terms of prediction accuracy for large generic data sets. Nevertheless, many limitations are not yet resolved and pose significant challenges for future developments.
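As a minimal sketch of the regression-based approach described in the abstract, the following Python example fits a linear empirical scoring function by ordinary least squares. Everything in it is an assumption made for illustration: the three descriptor columns, the function names `fit_linear_scoring_function` and `score`, and the synthetic, noise-free training data are hypothetical and are not taken from the article, which does not prescribe a specific descriptor set or implementation.

```python
import numpy as np


def fit_linear_scoring_function(descriptors, affinities):
    """Fit affinity ~ w . x + b by ordinary least squares.

    descriptors: array of shape (n_complexes, n_descriptors)
    affinities:  array of shape (n_complexes,), e.g. pKd values
    Returns the descriptor weights and the intercept.
    """
    X = np.asarray(descriptors, dtype=float)
    y = np.asarray(affinities, dtype=float)
    # Append a column of ones so the intercept is fitted with the weights.
    A = np.hstack([X, np.ones((X.shape[0], 1))])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return coef[:-1], coef[-1]


def score(descriptors, weights, intercept):
    """Predict affinities for new complexes from their descriptors."""
    return np.asarray(descriptors, dtype=float) @ weights + intercept


# Synthetic data for illustration only (NOT experimental values).
# Hypothetical descriptor columns: hydrogen-bond count,
# buried lipophilic surface, rotatable-bond count.
rng = np.random.default_rng(0)
X_train = rng.uniform(0.0, 10.0, size=(50, 3))
true_w = np.array([0.3, 0.1, -0.2])   # assumed "true" weights
y_train = X_train @ true_w + 2.0       # noise-free synthetic affinities

weights, intercept = fit_linear_scoring_function(X_train, y_train)
predicted = score(X_train, weights, intercept)
```

Because the synthetic affinities are generated without noise, the fit recovers the assumed weights; with real structural and affinity data the fit would be approximate, and a machine-learning regressor could be substituted for the least-squares step.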