State Key Laboratory of Physical Chemistry of Solid Surfaces and Fujian Provincial Key Laboratory of Theoretical and Computational Chemistry, College of Chemistry and Chemical Engineering, Xiamen University, Xiamen 361005, P. R. China.
College of Chemistry, Zhengzhou University, Zhengzhou 450001, P. R. China.
J Chem Inf Model. 2024 Apr 8;64(7):2263-2274. doi: 10.1021/acs.jcim.3c00567. Epub 2023 Jul 11.
Water network rearrangement from the ligand-unbound state to the ligand-bound state is known to significantly affect protein-ligand binding interactions, but most current machine learning-based scoring functions overlook these effects. In this study, we endeavor to construct a comprehensive and realistic deep learning model by incorporating water network information into both the ligand-unbound and ligand-bound states. In particular, extended connectivity interaction features were integrated into the graph representation, and a graph transformer operator was employed to extract features of the ligand-unbound and ligand-bound states. Through these efforts, we developed a water network-augmented two-state model called ECIFGraph::HM-Holo-Apo. Our new model exhibits satisfactory performance in the scoring, ranking, docking, screening, and reverse screening power tests on the CASF-2016 benchmark. In addition, it achieves superior performance in large-scale docking-based virtual screening tests on the DEKOIS2.0 data set. Our study highlights that a water network-augmented two-state model can be an effective strategy to bolster the robustness and applicability of machine learning-based scoring functions, particularly for targets with hydrophilic or solvent-exposed binding pockets.
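As a rough illustration of the two-state idea, and not the authors' implementation, the sketch below counts typed atom pairs within a distance cutoff, in the spirit of extended connectivity interaction features, once for the ligand-bound (holo) state and once for the ligand-unbound (apo) state, with water oxygens treated as their own atom type. Everything here is an assumption made for illustration: the function names `pair_counts` and `two_state_features`, the 6.0 Å cutoff, the element-only atom types, and the choice of which pairs to count in each state. The graph construction and graph transformer layers described in the abstract are not reproduced.

```python
from collections import Counter
from math import dist

# Hypothetical toy representation: each atom is (type_label, (x, y, z)),
# with coordinates in angstroms. Real ECIF atom types encode element and
# connectivity; plain element symbols are used here only for brevity.


def pair_counts(group_a, group_b, cutoff=6.0):
    """Count (type_a, type_b) pairs whose atoms lie within `cutoff` angstroms."""
    counts = Counter()
    for type_a, xyz_a in group_a:
        for type_b, xyz_b in group_b:
            if dist(xyz_a, xyz_b) <= cutoff:
                counts[(type_a, type_b)] += 1
    return counts


def two_state_features(protein, ligand, holo_waters, apo_waters, cutoff=6.0):
    """Return pair-count features for the bound (holo) and unbound (apo) states.

    Assumed layout, not taken from the paper:
      holo = protein-ligand pairs plus water-ligand pairs
      apo  = protein-water pairs in the empty, solvated pocket
    """
    holo = pair_counts(protein, ligand, cutoff)
    holo += pair_counts(holo_waters, ligand, cutoff)  # Counter += sums counts
    apo = pair_counts(protein, apo_waters, cutoff)
    return {"holo": holo, "apo": apo}


if __name__ == "__main__":
    # Made-up coordinates, chosen only to exercise the functions.
    protein = [("N", (0.0, 0.0, 0.0)), ("O", (3.0, 0.0, 0.0))]
    ligand = [("C", (0.0, 4.0, 0.0)), ("O", (10.0, 0.0, 0.0))]
    holo_waters = [("OW", (0.0, 6.0, 0.0))]
    apo_waters = [("OW", (0.0, 3.0, 0.0)), ("OW", (0.0, 5.0, 0.0))]

    feats = two_state_features(protein, ligand, holo_waters, apo_waters)
    print("holo:", dict(feats["holo"]))
    print("apo: ", dict(feats["apo"]))
```

In a learned model, the two count vectors (or per-edge versions of them) would be fed to the network as separate inputs, so that changes in the water network between the two states are visible to the scoring function.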