Wang Menglun, Cang Zixuan, Wei Guo-Wei
Department of Mathematics, Michigan State University, East Lansing, MI USA.
Department of Biochemistry and Molecular Biology, Michigan State University, East Lansing, MI USA.
Nat Mach Intell. 2020;2(2):116-123. doi: 10.1038/s42256-020-0149-6. Epub 2020 Feb 14.
The ability to predict protein-protein interactions is crucial to our understanding of a wide range of biological activities and functions in the human body, and for guiding drug discovery. Despite considerable efforts to develop suitable computational methods, predicting protein-protein interaction binding affinity changes following mutation (ΔΔG) remains a severe challenge. Algebraic topology, a champion in recent worldwide competitions for protein-ligand binding affinity predictions, is a promising approach to simplifying the complexity of biological structures. Here we introduce element- and site-specific persistent homology (a new branch of algebraic topology) to simplify the structural complexity of protein-protein complexes and embed crucial biological information into topological invariants. We also propose a new deep learning algorithm called NetTree to take advantage of convolutional neural networks and gradient-boosting trees. A topology-based network tree is constructed by integrating the topological representation and NetTree for predicting protein-protein interaction ΔΔG. Tests on major benchmark datasets indicate that the proposed topology-based network tree is an important improvement over the current state of the art in predicting ΔΔG.