Machine Learning to Predict Binding Affinity.

Author information

Bitencourt-Ferreira Gabriela, de Azevedo Walter Filgueira

Affiliation

Escola de Ciências da Saúde, Pontifícia Universidade Católica do Rio Grande do Sul-PUCRS, Porto Alegre, RS, Brazil.

Publication information

Methods Mol Biol. 2019;2053:251-273. doi: 10.1007/978-1-4939-9752-7_16.

DOI:10.1007/978-1-4939-9752-7_16

PMID:31452110

Abstract

Recent progress in the development of scientific libraries with machine-learning techniques has paved the way for integrated computational tools that predict ligand-binding affinity. These predictions use the atomic coordinates of protein-ligand complexes. The new tools make it possible to apply a broad spectrum of machine-learning techniques to the study of protein-ligand interactions. The essential aspect of these approaches is training a new computational model with techniques such as supervised machine learning, convolutional neural networks, and random forests, to mention the most commonly applied methods. In this chapter, we focus on supervised machine-learning techniques and their application to the development of protein-targeted scoring functions for predicting binding affinity. We discuss the development of the program SAnDReS and its application to the creation of machine-learning models that predict inhibition of cyclin-dependent kinase and HIV-1 protease. Moreover, we describe the scoring function space and how to use it to explain the development of targeted scoring functions.
