Department of Electrical Engineering and Computer Science, University of Missouri, Columbia, MO, 65211, USA.
Department of Biochemistry, University of Missouri, Columbia, MO, 65211, USA.
Brief Bioinform. 2022 Jan 17;23(1). doi: 10.1093/bib/bbab476.
Developing a new drug, from target identification to marketing approval, takes over 12 years and can cost around $2.6 billion. Furthermore, the COVID-19 pandemic has underscored the urgent need for more powerful computational methods for drug discovery. Here, we review computational approaches to predicting protein-ligand interactions in the context of drug discovery, focusing on methods that use artificial intelligence (AI). We begin with a brief introduction, intended for nonexperts, to proteins (targets), ligands (e.g. drugs) and their interactions. Next, we review databases that are commonly used in the domain of protein-ligand interactions. Finally, we survey and analyze the machine learning (ML) approaches used to predict protein-ligand binding sites, ligand-binding affinity and binding pose (conformation), including both classical ML algorithms and recent deep learning methods. We also explore the correlation among these three aspects of protein-ligand interaction, which has led to the proposal that they be studied in unison. We anticipate that our review will aid the exploration and development of more accurate ML-based prediction strategies for studying protein-ligand interactions.