Brookes Emre, Curtis Joseph E, Householder Aaron, Rocco Mattia
Department of Chemistry and Biochemistry, University of Montana, 32 Campus Drive, Missoula, Montana 59812, USA.
NIST Center for Neutron Research, National Institute of Standards and Technology, Gaithersburg, Maryland 20878, USA.
J Appl Crystallogr. 2025 May 29;58(Pt 3):1034-1049. doi: 10.1107/S1600576725003590. eCollection 2025 Jun 1.
AI programs such as AlphaFold (AF) are having a major impact on structural biology. However, predicted unstructured regions, the arrangement of linker-connected domains and their conformational changes in response to environmental variables present challenges that are not easily dealt with on purely computational grounds. An approach that uses predicted (or solved) protein modules/domains linked by potentially unstructured regions, and that generates ensembles of models optimized against small-angle X-ray scattering (SAXS) data, has recently been described [Brookes et al. (2023). J. Appl. Cryst. 56, 910-926]. Its implementation on a public-domain website, SAXS-A-FOLD (https://saxsafold.genapp.rocks), is presented here. The user first uploads the experimental SAXS intensity I(q) versus scattering vector magnitude q and the derived pair-wise distance distribution function P(r) versus r. An AF or user-supplied structure (currently only single chains without prosthetic groups) is then uploaded and displayed, and its SAXS I(q) and P(r) profiles are computed and compared with the experimental data. If uploaded from AF, the structure is color-coded by the associated confidence level; on this basis, the website automatically proposes potential flexible regions, which the user can modify. For user-supplied structures, these regions have to be entered directly. A starting pool of typically 10-50 × 10^3 conformations is generated using a Monte Carlo method that samples backbone dihedral angles along the chosen segments of potential flexibility in the protein structure. The initial pool is reduced to obtain a tractable set of models, for which I(q) and P(r) are computed with fast established methods. A global fit against the original data is performed using non-negatively constrained least squares (NNLS). The I(q) and P(r) NNLS results are then displayed, showing both the reconstructed curves and the contributing model curves, with their percentage contributions.
A WAXSiS (https://waxsis.uni-saarland.de) implementation is used to calculate an I(q) for each selected model. These sets can be enlarged by adding a user-defined number of models generated before and after each selected model in the original Monte Carlo pool, ensuring the inclusion of nearby models that might better fit the data. Finally, NNLS is applied to the WAXSiS-generated I(q) set versus the original I(q) data, with the results displaying the contributing models and their I(q). Aside from being representative of contributing conformations, the models selected by SAXS-A-FOLD could constitute a set of starting structures for more advanced MD simulations.
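The NNLS selection step and the enlargement of the selected set with neighboring pool models can be illustrated with a minimal sketch. It is not the website's code: the function names `fit_ensemble` and `expand_with_neighbors`, the optional error weighting, and the normalization of weights to percentages are assumptions made for illustration, and the only library call relied on is `scipy.optimize.nnls`.

```python
import numpy as np
from scipy.optimize import nnls


def fit_ensemble(model_curves, experimental, sigma=None):
    """Find non-negative weights so that the weighted sum of model curves
    approximates the experimental curve (hypothetical helper).

    model_curves: shape (n_models, n_points), one computed curve per row,
                  sampled on the same grid as `experimental`.
    experimental: shape (n_points,).
    sigma:        optional experimental errors, shape (n_points,); weighting
                  the residuals by 1/sigma is an assumption of this sketch.
    Returns (percent_contributions, reconstructed_curve).
    """
    A = np.asarray(model_curves, dtype=float).T  # columns are models
    b = np.asarray(experimental, dtype=float)
    if sigma is not None:
        w = 1.0 / np.asarray(sigma, dtype=float)
        weights, _ = nnls(A * w[:, None], b * w)
    else:
        weights, _ = nnls(A, b)
    reconstructed = A @ weights
    total = weights.sum()
    # Normalizing to percentages is an illustrative choice.
    percent = 100.0 * weights / total if total > 0 else weights
    return percent, reconstructed


def expand_with_neighbors(selected, n_neighbors, pool_size):
    """Add up to n_neighbors pool indices before and after each selected
    model index, clipped to the pool bounds (hypothetical helper)."""
    expanded = set()
    for i in selected:
        lo = max(0, i - n_neighbors)
        hi = min(pool_size - 1, i + n_neighbors)
        expanded.update(range(lo, hi + 1))
    return sorted(expanded)


# Synthetic demonstration: Guinier-like curves stand in for computed I(q).
q = np.linspace(0.01, 0.3, 50)
radii = [15.0, 25.0, 40.0]  # made-up radii of gyration
models = np.array([np.exp(-(q * rg) ** 2 / 3.0) for rg in radii])
data = 0.7 * models[0] + 0.3 * models[2]  # noise-free mixture of two models

percent, fit = fit_ensemble(models, data)
contributing = [i for i, p in enumerate(percent) if p > 0.0]
print("percent contributions:", np.round(percent, 2))
print("contributing models:", contributing)
print("with neighbors:", expand_with_neighbors(contributing, 1, len(models)))
```

Models with a zero weight drop out of the fit, which is what makes NNLS act as a selection step: only the contributing models would be passed on to a slower, more accurate I(q) calculation and refitted.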