Department of Biological Sciences, Purdue University, West Lafayette, Indiana, United States of America.
Department of Computer Science, Purdue University, West Lafayette, Indiana, United States of America.
PLoS Comput Biol. 2018 Jan 12;14(1):e1005937. doi: 10.1371/journal.pcbi.1005937. eCollection 2018 Jan.
Protein-protein interactions are the cornerstone of numerous biological processes. Although an increasing number of protein complex structures have been determined using experimental methods, relatively few studies have been performed to determine the assembly order of complexes. In addition to the insights into the molecular mechanisms of biological function provided by the structure of a complex, knowing the assembly order is important for understanding the process of complex formation. Assembly order is also practically useful for constructing subcomplexes as a step toward solving the entire complex experimentally, designing artificial protein complexes, and developing drugs that interrupt a critical step in the complex assembly. There are several experimental methods for determining the assembly order of complexes; however, these techniques are resource-intensive. Here, we present a computational method that predicts the assembly order of protein complexes by building the complex structure. The method, named Path-LZerD, uses a multimeric protein docking algorithm that assembles a protein complex structure from individual subunit structures and predicts assembly order by observing the simulated assembly process of the complex. Benchmarked on a dataset of complexes with experimental evidence of assembly order, Path-LZerD was successful in predicting the assembly pathway for the majority of the cases. Moreover, when compared with a simple approach that infers the assembly path from the buried surface area of subunits in the native complex, Path-LZerD has the strong advantage that it can be used for cases where the complex structure is not known. The path prediction accuracy decreased when starting from unbound monomers, particularly for larger complexes of five or more subunits, for which only a part of the assembly path was correctly identified. As the first method of its kind, Path-LZerD opens a new area of computational protein structure modeling and will be an indispensable approach for studying protein complexes.
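The simple baseline mentioned above, inferring an assembly path from buried surface area in the native complex, can be illustrated with a minimal sketch. This is not Path-LZerD and not necessarily the exact baseline used in the paper. It assumes a greedy rule in which the two subcomplexes sharing the largest total interface are joined first. The function name `infer_assembly_order`, the subunit labels, and all buried surface area values are hypothetical and chosen only for illustration.

```python
from itertools import combinations


def infer_assembly_order(subunits, pairwise_bsa):
    """Greedy sketch: repeatedly join the two subcomplexes that bury the most surface.

    subunits     -- list of subunit labels (hypothetical chain IDs)
    pairwise_bsa -- dict mapping frozenset({x, y}) to the buried surface area
                    between subunits x and y in the native complex (assumed
                    to be precomputed; missing pairs count as 0)
    Returns a list of (subcomplex_a, subcomplex_b, interface_area) merge steps.
    """
    clusters = [frozenset([s]) for s in subunits]
    steps = []

    def interface(pair):
        # Total buried area between two subcomplexes: sum over subunit pairs.
        a, b = pair
        return sum(pairwise_bsa.get(frozenset((x, y)), 0.0) for x in a for y in b)

    while len(clusters) > 1:
        # Assumption: the largest interface is taken to form first.
        best = max(combinations(clusters, 2), key=interface)
        a, b = best
        steps.append((sorted(a), sorted(b), interface(best)))
        clusters = [c for c in clusters if c not in (a, b)] + [a | b]
    return steps


# Hypothetical four-subunit complex with made-up interface areas.
example_bsa = {
    frozenset(("A", "B")): 1200.0,
    frozenset(("B", "C")): 800.0,
    frozenset(("A", "C")): 300.0,
    frozenset(("C", "D")): 950.0,
}
steps = infer_assembly_order(["A", "B", "C", "D"], example_bsa)
for a, b, area in steps:
    print(f"{'+'.join(a)} joins {'+'.join(b)} (interface {area:.0f})")
```

Because this sketch needs the interfaces of the native complex as input, it shows why such a baseline cannot be applied when the complex structure is unknown, which is the case the abstract says Path-LZerD is designed to handle.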