Center for the Study of Systems Biology, School of Biological Sciences, Georgia Institute of Technology, Atlanta, GA, USA.
School of Computer Science, Georgia Institute of Technology, Atlanta, GA, USA.
Nat Commun. 2022 Apr 1;13(1):1744. doi: 10.1038/s41467-022-29394-2.
Accurate descriptions of protein-protein interactions are essential for understanding biological systems. Remarkably accurate atomic structures have been recently computed for individual proteins by AlphaFold2 (AF2). Here, we demonstrate that the same neural network models from AF2 developed for single protein sequences can be adapted to predict the structures of multimeric protein complexes without retraining. In contrast to common approaches, our method, AF2Complex, does not require paired multiple sequence alignments. It achieves higher accuracy than some complex protein-protein docking strategies and provides a significant improvement over AF-Multimer, a development of AlphaFold for multimeric proteins. Moreover, we introduce metrics for predicting direct protein-protein interactions between arbitrary protein pairs and validate AF2Complex on some challenging benchmark sets and the E. coli proteome. Lastly, using the cytochrome c biogenesis system I as an example, we present high-confidence models of three sought-after assemblies formed by eight members of this system.