Nguyen Ngoc-Quang, Kang Jaewoo
Department of Computer Science and Engineering, Korea University, Seoul 02841, Republic of Korea.
AIGEN Sciences, Seoul 04778, Republic of Korea.
J Chem Inf Model. 2025 Jul 14;65(13):7252-7262. doi: 10.1021/acs.jcim.5c00773. Epub 2025 Jul 2.
Accurate prediction of compound-protein interactions (CPI) remains a cornerstone challenge in computational drug discovery. While existing sequence-based approaches leverage molecular fingerprints or graph representations, they overlook the three-dimensional (3D) structural determinants of binding affinity. To bridge this gap, we present EquiCPI, an end-to-end geometric deep learning framework that combines first-principles structural modeling with SE(3)-equivariant neural networks. Our pipeline transforms raw sequences into 3D atomic coordinates via ESMFold for proteins and DiffDock-L for ligands, followed by physics-guided conformer reranking and equivariant feature learning. At its core, EquiCPI employs SE(3)-equivariant message passing over atomic point clouds, preserving symmetry under rotations and translations, while hierarchically encoding local interaction patterns through tensor products of spherical harmonics. The proposed model is evaluated on BindingDB (affinity prediction) and DUD-E (virtual screening). EquiCPI achieves performance on par with or exceeding that of state-of-the-art deep learning competitors.
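To make the equivariance property concrete, the following is a minimal NumPy sketch of one message-passing step over an atomic point cloud, followed by a numerical check. It is not the EquiCPI implementation: the function names (`random_rotation`, `equivariant_layer`, `check_equivariance`), the Gaussian radial weight, and the `cutoff` value are all assumptions chosen for illustration, and the spherical-harmonic tensor products mentioned in the abstract are not modeled here. Because this toy layer depends only on interatomic distances and relative vectors, it also happens to commute with reflections.

```python
import numpy as np


def random_rotation(rng):
    """Draw a proper rotation matrix (orthogonal, det = +1)."""
    # QR of a Gaussian matrix yields an orthogonal Q; flip one column
    # if needed so that the determinant is +1 rather than -1.
    q, _ = np.linalg.qr(rng.normal(size=(3, 3)))
    if np.linalg.det(q) < 0:
        q[:, 0] = -q[:, 0]
    return q


def equivariant_layer(coords, cutoff=5.0):
    """Toy message-passing step (illustrative, hypothetical layer).

    Returns per-atom scalar features (invariant) and updated
    coordinates (equivariant) for an (N, 3) array of positions.
    """
    diff = coords[:, None, :] - coords[None, :, :]  # (N, N, 3): x_i - x_j
    dist = np.linalg.norm(diff, axis=-1)            # (N, N) pairwise distances
    mask = (dist < cutoff) & (dist > 0)             # neighbors within cutoff, no self-loops
    weight = np.exp(-dist**2) * mask                # radial weight, depends on distance only
    feats = weight.sum(axis=1)                      # invariant scalar per atom
    new_coords = coords + (weight[..., None] * diff).sum(axis=1)
    return feats, new_coords


def check_equivariance(seed=0, n_atoms=8):
    """Check f(xR^T + t) == f(x)R^T + t for a random rotation R and shift t."""
    rng = np.random.default_rng(seed)
    x = rng.normal(size=(n_atoms, 3))
    rot, shift = random_rotation(rng), rng.normal(size=3)
    feats_a, coords_a = equivariant_layer(x)
    feats_b, coords_b = equivariant_layer(x @ rot.T + shift)
    return bool(np.allclose(feats_a, feats_b)
                and np.allclose(coords_a @ rot.T + shift, coords_b))
```

Calling `check_equivariance()` transforms the input by a random rotation and translation and compares the result with the transformed output of the untouched input. The scalar features should be unchanged and the coordinates should rotate and shift along with the input, which is the behavior the abstract attributes to SE(3)-equivariant message passing.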