Jeevan Kandel, Palistha Shrestha, Hilal Tayara, Kil T. Chong
Graduate School of Integrated Energy-AI, Jeonbuk National University, Jeonju, 54896, South Korea.
Department of Electronics and Information Engineering, Jeonbuk National University, Jeonju, 54896, South Korea.
J Cheminform. 2024 Jun 7;16(1):66. doi: 10.1186/s13321-024-00865-6.
Accurate ligand binding site prediction (LBSP) within proteins is essential for drug discovery. We developed ProteinUNetResNetV2.0 (PUResNetV2.0), leveraging sparse representation of protein structures to improve LBSP accuracy. Our training dataset included protein complexes from 4729 protein families. Evaluations on benchmark datasets showed that PUResNetV2.0 achieved an 85.4% Distance Center Atom (DCA) success rate and a 74.7% F1 Score on the Holo801 dataset, outperforming existing methods. However, its performance in specific cases, such as RNA, DNA, peptide-like ligand, and ion binding site prediction, was limited due to constraints in our training data. Our findings underscore the potential of sparse representation in LBSP, especially for oligomeric structures, suggesting PUResNetV2.0 as a promising tool for computational drug discovery.
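The DCA success rate quoted above can be illustrated with a minimal sketch. It assumes the commonly used convention that a prediction counts as a success when the predicted site center lies within 4 Å of any ligand atom; the abstract does not state the threshold, so `threshold=4.0` is an assumption, as are the function names and the toy coordinates. Evaluating one predicted center per structure is also a simplification, not necessarily the authors' protocol.

```python
import math

def dca_success(pred_center, ligand_atoms, threshold=4.0):
    """Return True if the predicted site center is within `threshold`
    angstroms of the closest ligand atom (assumed DCA criterion)."""
    closest = min(math.dist(pred_center, atom) for atom in ligand_atoms)
    return closest <= threshold

def dca_success_rate(cases, threshold=4.0):
    """Fraction of (pred_center, ligand_atoms) cases that meet the criterion."""
    hits = sum(dca_success(center, atoms, threshold) for center, atoms in cases)
    return hits / len(cases)

# Toy coordinates, invented for illustration only.
cases = [
    # Closest ligand atom is 3.0 angstroms away: counted as a success.
    ((0.0, 0.0, 0.0), [(1.0, 2.0, 2.0), (10.0, 0.0, 0.0)]),
    # Closest ligand atom is 5.0 angstroms away: counted as a failure.
    ((0.0, 0.0, 0.0), [(3.0, 4.0, 0.0)]),
]
rate = dca_success_rate(cases)
print(f"DCA success rate: {rate:.1%}")
```

Under this reading, the reported 85.4% would mean that roughly 85 of every 100 evaluated binding sites had a predicted center meeting the distance criterion.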