Han Jiyun, Zhang Shizhuo, Guan Mingming, Li Qiuyu, Gao Xin, Liu Juntao
School of Mathematics and Statistics, Shandong University, Weihai 264209, China.
Computer Science Program, Computer, Electrical and Mathematical Sciences and Engineering Division, King Abdullah University of Science and Technology (KAUST), Thuwal 23955, Saudi Arabia; Computational Bioscience Research Center (CBRC), Computer, Electrical and Mathematical Sciences and Engineering Division, King Abdullah University of Science and Technology (KAUST), Thuwal 23955, Saudi Arabia.
Structure. 2024 Dec 5;32(12):2435-2448.e5. doi: 10.1016/j.str.2024.10.011. Epub 2024 Nov 1.
Identifying the binding residues of proteins is essential for understanding protein function in vivo. However, accurately identifying binding sites remains a computational challenge because known residue binding patterns are lacking. Binding patterns are determined both by the local spatial distribution of residues and by the biophysical environment in which those residues interact. Previous methods could not capture both kinds of information simultaneously, resulting in unsatisfactory performance. Here, we present GeoNet, an interpretable geometric deep learning model that predicts DNA, RNA, and protein binding sites by learning latent residue binding patterns. GeoNet does this by introducing a coordinate-free geometric representation to characterize local residue distributions and by generating an eigenspace to depict local interactive biophysical environments. Evaluation shows that GeoNet outperforms other leading predictors and that its learned representations are highly interpretable. We present three test cases in which GeoNet successfully identified the interaction interfaces.
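To illustrate what a "coordinate-free" geometric representation can mean in practice, the following minimal sketch describes each residue by the sorted distances to its nearest neighbours and checks that the descriptor is unchanged when the whole structure is rotated and translated. This is not GeoNet's actual representation, which the abstract does not specify; the function names, the choice of `k`, and the random coordinates standing in for residue positions are all assumptions made for illustration.

```python
import numpy as np


def local_distance_descriptor(coords, k=4):
    """Hypothetical descriptor: sorted distances from each residue to its
    k nearest neighbours. It depends only on inter-residue distances, so it
    does not change under rigid motions of the structure."""
    coords = np.asarray(coords, dtype=float)
    diff = coords[:, None, :] - coords[None, :, :]
    dist = np.linalg.norm(diff, axis=-1)  # (n, n) pairwise distance matrix
    dist_sorted = np.sort(dist, axis=1)   # column 0 is the self-distance (0)
    return dist_sorted[:, 1:k + 1]


def random_rigid_motion(rng):
    """Draw a random proper rotation matrix and a translation vector."""
    q, _ = np.linalg.qr(rng.normal(size=(3, 3)))
    if np.linalg.det(q) < 0:  # flip one axis to avoid a reflection
        q[:, 0] = -q[:, 0]
    return q, rng.normal(size=3)


rng = np.random.default_rng(0)
coords = rng.normal(size=(10, 3)) * 5.0  # stand-in for residue positions

rotation, translation = random_rigid_motion(rng)
moved = coords @ rotation.T + translation

before = local_distance_descriptor(coords)
after = local_distance_descriptor(moved)
print(np.allclose(before, after))
```

The comparison is expected to hold up to floating-point tolerance, since a rigid motion preserves all pairwise distances. A model fed such a descriptor does not need to learn rotational or translational invariance from the data.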