Molecular Biophysics and Integrated Bioimaging, Lawrence Berkeley National Laboratory, Berkeley, California.
Department of Chemistry and Biochemistry, University of California Santa Cruz, Santa Cruz, California.
Proteins. 2019 Dec;87(12):1298-1314. doi: 10.1002/prot.25827. Epub 2019 Oct 16.
Small-angle X-ray scattering (SAXS) measures comprehensive distance information on a protein's structure, which can constrain and guide computational structure prediction algorithms. Here, we evaluate structure predictions of 11 monomeric and oligomeric proteins for which SAXS data were collected and provided to predictors in the 13th round of the Critical Assessment of protein Structure Prediction (CASP13). The category for SAXS-assisted predictions made gains in certain areas in CASP13 compared to CASP12. Improvements included higher-quality data obtained with size-exclusion chromatography-coupled SAXS (SEC-SAXS) and better selection of targets and communication of results by CASP organizers. In several cases, we can trace improvements in model accuracy to the use of SAXS data. For hard multimeric targets where regular folding algorithms were unsuccessful, SAXS data helped predictors build models that better resembled the global shape of the target. For most models, however, the use of SAXS data produced no significant improvement in model accuracy at the domain level when SAXS-assisted models were rigorously compared with the best regular server predictions. To promote future progress in this category, we identify successes, challenges, and opportunities for improved strategies in prediction, assessment, and communication of SAXS data to predictors. An important observation is that, for many targets, SAXS data were inconsistent with crystal structures, suggesting that these proteins adopt different conformation(s) in solution. This CASP13 result, if representative of PDB structures and future CASP targets, may have substantive implications for the structure training databases used for machine learning, for CASP, and for the use of prediction models in biology.
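As a rough illustration of how a SAXS profile encodes pairwise distances, and of how agreement between a model and solution data might be scored, the following minimal sketch computes a profile with the Debye formula and a reduced chi-square after fitting a single scale factor. It is not the pipeline used in the assessment. It assumes unit form factors for every scatterer, ignores excluded solvent and the hydration layer, and uses toy four-point models with synthetic "experimental" data. The function names `debye_intensity` and `reduced_chi2` are hypothetical.

```python
import math


def debye_intensity(coords, q_values):
    """Debye formula with unit form factors (a simplifying assumption):
    I(q) = sum_i sum_j sin(q * r_ij) / (q * r_ij)."""
    n = len(coords)
    # Distances for all pairs i < j; each pair is counted twice in the double sum.
    dists = [
        math.dist(coords[i], coords[j])
        for i in range(n)
        for j in range(i + 1, n)
    ]
    profile = []
    for q in q_values:
        total = float(n)  # the i == j terms each contribute 1
        for r in dists:
            x = q * r
            total += 2.0 * (math.sin(x) / x if x > 1e-12 else 1.0)
        profile.append(total)
    return profile


def reduced_chi2(i_exp, sigma, i_calc):
    """Fit one scale factor by weighted least squares, then return
    (chi-square per data point, scale factor)."""
    num = sum(e * c / s**2 for e, c, s in zip(i_exp, i_calc, sigma))
    den = sum(c * c / s**2 for c, s in zip(i_calc, sigma))
    scale = num / den
    chi2 = sum(
        ((e - scale * c) / s) ** 2 for e, c, s in zip(i_exp, i_calc, sigma)
    ) / len(i_exp)
    return chi2, scale


# Toy models (coordinates in angstroms): an extended chain and a compact square.
extended = [(0.0, 0.0, 0.0), (10.0, 0.0, 0.0), (20.0, 0.0, 0.0), (30.0, 0.0, 0.0)]
compact = [(0.0, 0.0, 0.0), (10.0, 0.0, 0.0), (0.0, 10.0, 0.0), (10.0, 10.0, 0.0)]

# Momentum transfer grid q in inverse angstroms.
q_grid = [0.01 * k for k in range(1, 51)]

# Synthetic "solution" data: the extended model on an arbitrary intensity scale,
# with an assumed error model. Real data would come from a SAXS measurement.
i_solution = [3.0 * v for v in debye_intensity(extended, q_grid)]
sigma = [0.01 * v + 1e-3 for v in i_solution]

chi2_extended, scale_extended = reduced_chi2(
    i_solution, sigma, debye_intensity(extended, q_grid)
)
chi2_compact, scale_compact = reduced_chi2(
    i_solution, sigma, debye_intensity(compact, q_grid)
)

print("chi2 (extended model):", chi2_extended)
print("chi2 (compact model): ", chi2_compact)
```

In this toy setup the compact model plays the role of a structure that differs from the solution conformation, so it fits the synthetic data worse than the extended model does. Production tools additionally model atomic form factors, excluded solvent, and the hydration shell, all of which this sketch omits.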