Can molecular dynamics simulations improve predictions of protein-ligand binding affinity with machine learning?

Authors

Gu Shukai, Shen Chao, Yu Jiahui, Zhao Hong, Liu Huanxiang, Liu Liwei, Sheng Rong, Xu Lei, Wang Zhe, Hou Tingjun, Kang Yu

Affiliations

Innovation Institute for Artificial Intelligence in Medicine of Zhejiang University, College of Pharmaceutical Sciences, Zhejiang University, Hangzhou 310058, Zhejiang, China.

Faculty of Applied Science, Macao Polytechnic University, Macao, SAR, China.

Publication information

Brief Bioinform. 2023 Mar 19;24(2). doi: 10.1093/bib/bbad008.

Abstract

Binding affinity prediction largely determines how efficiently lead compounds are found in drug discovery. Recently, machine learning (ML)-based approaches have attracted much attention in the hope of improving on the predictive performance of traditional physics-based approaches. In this study, we evaluated the impact of structural dynamic information on binding affinity prediction by comparing models trained on descriptors of different dimensionality, using three targets (i.e., JAK1, TAF1-BD2 and DDR1) and their corresponding ligands as examples. Here, the 2D descriptors are traditional ECFP4 fingerprints, the 3D descriptors are the energy terms of the Smina and NNscore scoring functions, and the 4D descriptors contain structural dynamic information derived from molecular dynamics (MD) simulation trajectories. We systematically investigated the MD-refined binding affinity prediction performance of three classical ML algorithms (i.e., random forest (RF), support vector regression (SVR) and extreme gradient boosting (XGB)) as well as two common virtual screening methods, namely Glide docking and MM/PBSA. The results of the ML models built on the various dimensional descriptors and their combinations show that, when docking poses rather than crystal structures are taken as the initial structures, MD refinement with the optimized protocol can improve predictive performance for the TAF1-BD2 target, which has considerable structural flexibility, but not for the less flexible JAK1 and DDR1 targets. Conformational analysis of the three targets, which differ in flexibility, highlights the importance of the initial structures to the final performance of the model.
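To make the idea of combining descriptors of different dimensionality concrete, the following is a minimal sketch, not the authors' implementation. The abstract does not say how the 4D descriptors are computed from the trajectories, so this sketch assumes they are the per-term mean and standard deviation of scoring-function energy terms evaluated on each MD frame. The function names `trajectory_descriptors` and `combine_descriptors` and all numeric values are hypothetical.

```python
from statistics import mean, pstdev


def trajectory_descriptors(frames):
    """Summarize per-frame energy terms over an MD trajectory.

    `frames` is a list of equal-length lists, one per MD frame, each holding
    the energy terms scored on that frame. Returns the per-term means followed
    by the per-term population standard deviations.
    ASSUMPTION: mean/std summarization is illustrative only; the source does
    not specify how its 4D descriptors are derived.
    """
    n_terms = len(frames[0])
    # Transpose: one column of values per energy term.
    columns = [[frame[i] for frame in frames] for i in range(n_terms)]
    return [mean(col) for col in columns] + [pstdev(col) for col in columns]


def combine_descriptors(fingerprint_bits, static_terms, frames):
    """Concatenate 2D (fingerprint), 3D (static pose) and 4D (trajectory) blocks."""
    return list(fingerprint_bits) + list(static_terms) + trajectory_descriptors(frames)


# Toy input (hypothetical numbers): a 2-bit fingerprint, one static energy term,
# and two MD frames with two energy terms each.
features = combine_descriptors([1, 0], [-7.5], [[1.0, 2.0], [3.0, 2.0]])
```

A feature vector built this way could then be passed to any regressor, such as the RF, SVR or XGB models named in the abstract; dropping one block from the concatenation gives the kind of descriptor-combination comparison the study describes.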
