Liu Jian, Neupane Pawan, Cheng Jianlin
Department of Electrical Engineering and Computer Science, NextGen Precision Health, University of Missouri, Columbia, MO 65211, USA.
bioRxiv. 2025 Jan 16:2025.01.12.632663. doi: 10.1101/2025.01.12.632663.
Protein structure prediction methods require stoichiometry information (i.e., subunit counts) to predict the quaternary structure of protein complexes. However, this information is often unavailable, so stoichiometry must be predicted before a complex of unknown composition can be modeled. Despite its importance, few computational methods address this challenge. In this study, we present an approach that integrates AlphaFold3 structure predictions with homologous template data to predict stoichiometry. The method generates candidate stoichiometries, builds structural models for each of them using AlphaFold3, ranks the candidates by their AlphaFold3 scores, and further refines the predictions with template-based information when available. In the 16th community-wide Critical Assessment of Techniques for Protein Structure Prediction (CASP16), our method achieved 71.4% top-1 accuracy and 92.9% top-3 accuracy, outperforming other predictors in overall performance. These results demonstrate the complementary strengths of AlphaFold3-based and template-based predictions and highlight the method's applicability to uncharacterized protein complexes lacking stoichiometry data.
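The generate-then-rank workflow described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. The function names, the `max_copies` bound, the toy scores, and in particular the rule of promoting a template-derived stoichiometry to the top of the ranking are all assumptions made for illustration; the abstract does not specify how AlphaFold3 scores are computed or how template information is combined with them.

```python
from itertools import product


def enumerate_stoichiometries(subunits, max_copies=4):
    """Yield every candidate with 1..max_copies copies of each subunit.

    `max_copies` is a hypothetical search bound, not a value from the paper.
    """
    for counts in product(range(1, max_copies + 1), repeat=len(subunits)):
        yield dict(zip(subunits, counts))


def format_stoichiometry(stoich):
    """Render {'A': 2, 'B': 1} as the string 'A2B1'."""
    return "".join(f"{name}{count}" for name, count in stoich.items())


def rank_candidates(candidates, score_fn, template_stoich=None):
    """Sort candidates by score, highest first.

    If a template-derived stoichiometry is given and is among the
    candidates, move it to the front. This promotion rule is an
    assumption; the source only says templates refine the ranking.
    """
    ranked = sorted(candidates, key=score_fn, reverse=True)
    if template_stoich is not None and template_stoich in ranked:
        ranked.remove(template_stoich)
        ranked.insert(0, template_stoich)
    return ranked


def top_k_hit(ranked, truth, k):
    """Return True if the true stoichiometry is among the first k candidates."""
    return truth in ranked[:k]


# Toy scores standing in for per-candidate AlphaFold3 model scores
# (made-up numbers, not results from the paper).
toy_scores = {"A1B1": 0.4, "A1B2": 0.5, "A2B1": 0.3, "A2B2": 0.8}


def toy_score(stoich):
    """Look up the placeholder score for a candidate stoichiometry."""
    return toy_scores[format_stoichiometry(stoich)]


candidates = list(enumerate_stoichiometries(["A", "B"], max_copies=2))
ranked = rank_candidates(candidates, toy_score)
print([format_stoichiometry(s) for s in ranked])
```

In a real pipeline, `toy_score` would be replaced by a score derived from AlphaFold3 models built for each candidate. Top-1 and top-3 accuracy over a benchmark would then be the fraction of targets for which `top_k_hit` returns True with `k=1` and `k=3`.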