Division of Electrical and Computer Engineering, Louisiana State University, Baton Rouge, LA 70803, USA.
Department of Biological Sciences, Louisiana State University, Baton Rouge, LA 70803, USA.
Biomolecules. 2022 Jul 29;12(8):1053. doi: 10.3390/biom12081053.
The binding of small organic molecules to protein targets is fundamental to a wide array of cellular functions. It is also routinely exploited to develop new therapeutic strategies against a variety of diseases. The ability to effectively detect and classify ligand binding sites in proteins is therefore of paramount importance to modern structure-based drug discovery. These non-trivial tasks require sophisticated algorithms from the field of artificial intelligence to achieve high prediction accuracy. In this communication, we describe GraphSite, a deep learning-based method utilizing a graph representation of local protein structures and a state-of-the-art graph neural network to classify ligand binding sites. Using neural weighted message passing layers to effectively capture the structural, physicochemical, and evolutionary characteristics of binding pockets mitigates model overfitting and improves the classification accuracy. Indeed, comprehensive cross-validation benchmarks against a large dataset of binding pockets belonging to 14 diverse functional classes demonstrate that GraphSite achieves a class-weighted F1-score of 81.7%, outperforming other approaches such as molecular docking and binding site matching. It also generalizes well to unseen data with an F1-score of 70.7%, which is the expected performance in real-world applications. We also discuss new directions to improve and extend GraphSite in the future.
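For readers unfamiliar with the metric reported above, the following is a minimal sketch of how a class-weighted F1-score is conventionally computed: the per-class F1 is averaged with weights proportional to the number of true samples in each class. It is not taken from the GraphSite code; the function name `class_weighted_f1` and the pocket labels in the usage example are hypothetical and chosen only for illustration.

```python
from collections import Counter


def class_weighted_f1(y_true, y_pred):
    """Average per-class F1, weighting each class by its share of true labels.

    Illustrative helper (hypothetical name), not the authors' implementation.
    """
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("y_true and y_pred must be non-empty and equally long")

    support = Counter(y_true)  # number of true samples per class
    total = len(y_true)
    score = 0.0

    for label, n_true in support.items():
        # True positives: samples of this class that were predicted as this class.
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == label and p == label)
        n_pred = sum(1 for p in y_pred if p == label)

        precision = tp / n_pred if n_pred else 0.0
        recall = tp / n_true
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0

        # Weight the class F1 by the fraction of samples belonging to the class.
        score += (n_true / total) * f1

    return score


if __name__ == "__main__":
    # Hypothetical pocket classes, for illustration only.
    true_labels = ["ATP", "ATP", "ATP", "heme", "heme", "glucose"]
    pred_labels = ["ATP", "ATP", "heme", "heme", "heme", "ATP"]
    print(f"class-weighted F1: {class_weighted_f1(true_labels, pred_labels):.3f}")
```

Because the weights follow class frequency, a large class influences the score more than a rare one, which is why this average is commonly reported for imbalanced multi-class datasets.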