Du Hongyan, Jiang Dejun, Gao Junbo, Zhang Xujun, Jiang Lingxiao, Zeng Yundian, Wu Zhenxing, Shen Chao, Xu Lei, Cao Dongsheng, Hou Tingjun, Pan Peichen
Innovation Institute for Artificial Intelligence in Medicine of Zhejiang University, College of Pharmaceutical Sciences, Zhejiang University, Hangzhou, 310058 Zhejiang, China.
State Key Lab of CAD&CG, Zhejiang University, Hangzhou, 310058 Zhejiang, China.
Research (Wash D C). 2022 Jul 21;2022:9873564. doi: 10.34133/2022/9873564. eCollection 2022.
Covalent ligands have attracted increasing attention due to their unique advantages, such as long residence time, high selectivity, and strong binding affinity. They also show promise for targets where previous efforts to identify noncovalent small-molecule inhibitors have failed. However, limited knowledge of covalent binding sites has hindered the discovery of novel ligands, so in silico methods for identifying these sites are highly desirable. Here, we propose DeepCoSI, the first structure-based deep graph learning model to identify ligandable covalent sites in proteins. By integrating the characterization of the binding pocket with the interactions between each cysteine and its surrounding environment, DeepCoSI achieves state-of-the-art predictive performance. Validation on two external test sets that mimic real application scenarios shows that DeepCoSI has a strong ability to distinguish ligandable sites from the others. Finally, we profiled the entire set of protein structures in the RCSB Protein Data Bank (PDB) with DeepCoSI to evaluate the ligandability of each cysteine for covalent ligand design, and made the predicted data publicly available on a website.