Computational Biology and Clinical Informatics, Baker Heart and Diabetes Institute, Melbourne, Victoria, Australia.
School of Chemistry and Molecular Biosciences, University of Queensland, Brisbane, Queensland, Australia.
Nucleic Acids Res. 2022 Jul 5;50(W1):W204-W209. doi: 10.1093/nar/gkac381.
Recent advances in protein structural modelling have enabled the accurate prediction of the holo 3D structures of almost any protein; however, protein function is intrinsically linked to the interactions a protein makes. While a number of computational approaches have been proposed to explore potential biological interactions, they have been limited to specific interactions and have not been readily accessible to non-experts or for use in bioinformatics pipelines. Here we present CSM-Potential, a geometric deep learning approach that identifies regions of a protein surface likely to mediate protein-protein and protein-ligand interactions, providing a link between 3D structure and biological function. Our method showed robust performance, outperforming existing methods on both predictive tasks. On independent blind tests, CSM-Potential achieved ROC AUC values of up to 0.81 for the identification of potential protein-protein binding sites and an accuracy of up to 0.96 for biological ligand classification. CSM-Potential is freely available as a user-friendly web server and API at http://biosig.unimelb.edu.au/csm_potential.