Hunan Provincial Key Lab on Bioinformatics, School of Computer Science and Engineering, Central South University, Changsha, Hunan 410083, P.R. China.
School of Physics, Huazhong University of Science and Technology, Wuhan, Hubei 430074, P.R. China.
Bioinformatics. 2022 Apr 28;38(9):2444-2451. doi: 10.1093/bioinformatics/btac120.
Protein-protein interactions (PPIs) play important roles in cellular activities. Because experimental methods are technically difficult and costly, there is considerable interest in developing computational approaches, such as protein docking, to decipher PPI patterns. One important and difficult aspect of protein docking is recognizing near-native conformations within a set of decoys, yet traditional scoring functions still suffer from limited accuracy. New scoring methods are therefore urgently needed, for both methodological and practical reasons.
We present TRScore, a new deep learning-based scoring method for ranking protein-protein docking models, built on a 3D RepVGG network. To recognize near-native conformations within a set of decoys, TRScore voxelizes the protein-protein interface into a 3D grid in which each voxel is labeled with the number of atoms in each physicochemical class. Benefiting from the deep convolutional RepVGG architecture, TRScore can effectively capture the subtle differences between energetically favorable near-native models and unfavorable non-native decoys without needing extra information. TRScore was extensively evaluated on diverse test sets, including the protein-protein docking benchmark 5.0 update set, the DockGround decoy set, and the realistic CAPRI decoy set, and overall it achieved a significant improvement over existing methods in both cross-validation and independent evaluations.
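The voxelization step described above can be illustrated with a minimal sketch. It is not the TRScore implementation: the function name `voxelize_interface`, the number of physicochemical classes (4), the grid size (20 voxels per side), the 1.0 Å resolution, and the choice of centering the grid on the mean atom position are all assumptions made for illustration. The input is assumed to be already restricted to interface atoms, each tagged with an integer class index.

```python
import numpy as np


def voxelize_interface(coords, classes, n_classes=4, grid_size=20, resolution=1.0):
    """Count atoms per voxel and per class in a cube centered on the interface.

    coords  : (N, 3) array of interface atom coordinates (hypothetical input)
    classes : (N,) integer class index per atom, in [0, n_classes)
    Returns an array of shape (n_classes, grid_size, grid_size, grid_size).
    """
    coords = np.asarray(coords, dtype=float).reshape(-1, 3)
    classes = np.asarray(classes, dtype=int)
    grid = np.zeros((n_classes, grid_size, grid_size, grid_size), dtype=np.float32)
    if coords.shape[0] == 0:
        return grid

    # Assumption: the grid is centered on the mean position of the interface atoms.
    center = coords.mean(axis=0)
    half_extent = grid_size * resolution / 2.0

    # Map each coordinate to an integer voxel index along x, y, z.
    idx = np.floor((coords - center + half_extent) / resolution).astype(int)

    # Discard atoms that fall outside the cube.
    inside = np.all((idx >= 0) & (idx < grid_size), axis=1)
    idx, cls = idx[inside], classes[inside]

    # np.add.at accumulates correctly when several atoms share a voxel.
    np.add.at(grid, (cls, idx[:, 0], idx[:, 1], idx[:, 2]), 1.0)
    return grid
```

A tensor of this shape, one channel per atom class, is the kind of input a 3D convolutional network could consume; the actual channel definitions and grid parameters used by TRScore are given in the paper and repository, not here.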
Code is available at: https://github.com/BioinformaticsCSU/TRScore.