Protein complex structure prediction powered by multiple sequence alignments of interologs from multiple taxonomic ranks and AlphaFold2.

Affiliation

School of Physics, Huazhong University of Science and Technology, China.

Publication information

Brief Bioinform. 2022 Jul 18;23(4). doi: 10.1093/bib/bbac208.

Abstract

AlphaFold2 can predict protein complex structures as long as a multiple sequence alignment (MSA) of the interologs of the target protein-protein interaction (PPI) is provided. In this study, a simplified phylogeny-based approach was applied to generate the MSA of interologs, which was then used as the input to AlphaFold2 for protein complex structure prediction. We benchmarked this protocol extensively on a nonredundant PPI dataset comprising 107 bacterial PPIs and 442 eukaryotic PPIs, and show that the complex structures of 79.5% of the bacterial PPIs and 49.8% of the eukaryotic PPIs can be successfully predicted, a significantly better performance than that obtained with MSAs of interologs prepared by two existing approaches. Because PPIs may not be conserved between species separated by long evolutionary distances, we further restricted the interologs in the MSA to different taxonomic ranks of the species of the target PPI. We found that the success rates increase to 87.9% for the bacterial PPIs and 56.3% for the eukaryotic PPIs if the interologs in the MSA are restricted to a specific taxonomic rank of the species of each target PPI. Finally, we show that the optimal taxonomic rank for protein complex structure prediction can be selected using the predicted template modeling (TM) scores of the output models.
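The two ideas in the abstract, restricting the interologs to a taxonomic rank and picking a rank by predicted TM-score, can be illustrated with a minimal Python sketch. It is not the authors' implementation. The `PairedRow` type, the function names, the toy sequences and the score values are all hypothetical, and the rule "take the rank whose model has the highest predicted TM-score" is an assumption, since the abstract does not spell out how the scores are applied.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class PairedRow:
    """One row of a paired MSA: an interolog pair taken from a single species.

    Hypothetical container for illustration; not a format used by AlphaFold2.
    """
    species: str
    lineage: Dict[str, str]  # taxonomic rank name -> taxon name
    seq_a: str               # aligned sequence for the first partner
    seq_b: str               # aligned sequence for the second partner


def restrict_to_rank(rows: List[PairedRow],
                     target_lineage: Dict[str, str],
                     rank: str) -> List[PairedRow]:
    """Keep only rows whose species shares the target's taxon at `rank`."""
    taxon = target_lineage.get(rank)
    if taxon is None:
        return []
    return [row for row in rows if row.lineage.get(rank) == taxon]


def select_best_rank(ptm_by_rank: Dict[str, float]) -> str:
    """Return the rank with the highest predicted TM-score.

    Assumption: highest score wins. The abstract only says the predicted
    TM-scores of the output models are used for the selection.
    """
    if not ptm_by_rank:
        raise ValueError("no predicted TM-scores supplied")
    return max(ptm_by_rank, key=ptm_by_rank.get)


# Toy data (made up): the target PPI is from a gammaproteobacterium.
target_lineage = {"phylum": "Proteobacteria", "class": "Gammaproteobacteria"}
rows = [
    PairedRow("Escherichia coli",
              {"phylum": "Proteobacteria", "class": "Gammaproteobacteria"},
              "MKT", "MSD"),
    PairedRow("Caulobacter vibrioides",
              {"phylum": "Proteobacteria", "class": "Alphaproteobacteria"},
              "MKS", "MTD"),
    PairedRow("Bacillus subtilis",
              {"phylum": "Firmicutes", "class": "Bacilli"},
              "MRT", "MSE"),
]

# One restricted MSA per candidate rank; each would be a separate AlphaFold2 run.
by_rank = {rank: restrict_to_rank(rows, target_lineage, rank)
           for rank in target_lineage}

# Placeholder scores standing in for the predicted TM-scores of those runs.
placeholder_ptm = {"phylum": 0.71, "class": 0.64}
best_rank = select_best_rank(placeholder_ptm)
```

In this sketch each restricted MSA in `by_rank` would feed its own structure prediction, and `select_best_rank` would then be applied to the scores those runs report in place of the placeholder values.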
