Brookes Emre, Rocco Mattia, Vachette Patrice, Trewhella Jill
Department of Chemistry and Biochemistry, University of Montana, 32 Campus Drive, Missoula, MT 59812, USA.
Proteomica e Spettrometria di Massa, IRCCS Ospedale Policlinico San Martino, Largo R. Benzi 10, Genova 16132, Italy.
J Appl Crystallogr. 2023 Jul 20;56(Pt 4):910-926. doi: 10.1107/S1600576723005344. eCollection 2023 Aug 1.
By providing predicted protein structures from nearly all known protein sequences, the artificial intelligence program AlphaFold (AF) is having a major impact on structural biology. While a stunning accuracy has been achieved for many folding units, predicted unstructured regions and the arrangement of potentially flexible linkers connecting structured domains present challenges. Focusing on single-chain structures without prosthetic groups, an earlier comparison of features derived from small-angle X-ray scattering (SAXS) data taken from the Small-Angle Scattering Biological Data Bank (SASBDB) is extended to those calculated using the corresponding AF-predicted structures. Selected SASBDB entries were carefully examined to ensure that they represented data from monodisperse protein solutions and had sufficient statistical precision and resolution for reliable structural evaluation. Three examples were identified where there is clear evidence that the single AF-predicted structure cannot account for the experimental SAXS data. Instead, excellent agreement is found with ensemble models generated by allowing for flexible linkers between high-confidence predicted structured domains. A pool of representative structures was generated using a Monte Carlo method that adjusts backbone dihedral allowed angles along potentially flexible regions. A fast ensemble modelling method was employed that optimizes the fit of pair distance distribution functions [P(r) versus r] and intensity profiles [I(q) versus q] computed from the pool to their experimental counterparts. These results highlight the complementarity between AF prediction, solution SAXS and molecular dynamics/conformational sampling for structural modelling of proteins having both structured and flexible regions.
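The ensemble-fitting idea described in the abstract can be illustrated with a minimal sketch. It is not the authors' method: the abstract does not specify the optimizer, so this example uses a brute-force grid search over population weights, which is feasible only for a very small pool. The profiles are toy curves from the Guinier approximation, I(q) = exp(-(q Rg)^2 / 3), standing in for intensities computed from real structures. All function names, the Rg values, the q grid and the error level are assumptions made for illustration.

```python
import math


def guinier_profile(q_values, rg):
    """Toy intensity profile: Guinier approximation I(q) = exp(-(q*Rg)^2 / 3)."""
    return [math.exp(-((q * rg) ** 2) / 3.0) for q in q_values]


def ensemble_profile(weights, pool):
    """Population-weighted average of the pool profiles, point by point."""
    n_points = len(pool[0])
    return [
        sum(w * profile[i] for w, profile in zip(weights, pool))
        for i in range(n_points)
    ]


def chi2_per_point(model, data, sigma):
    """Mean squared error-normalized residual between model and data."""
    total = sum(((m - d) / s) ** 2 for m, d, s in zip(model, data, sigma))
    return total / len(data)


def weight_grid(n_members, steps):
    """Yield every weight vector with entries k/steps that sums to 1."""

    def compositions(total, parts):
        # All ways to split an integer total into `parts` non-negative integers.
        if parts == 1:
            yield (total,)
            return
        for first in range(total + 1):
            for rest in compositions(total - first, parts - 1):
                yield (first,) + rest

    for counts in compositions(steps, n_members):
        yield tuple(c / steps for c in counts)


def fit_ensemble(pool, data, sigma, steps=10):
    """Return the grid weights that minimize chi2_per_point, and that minimum."""
    best = min(
        weight_grid(len(pool), steps),
        key=lambda w: chi2_per_point(ensemble_profile(w, pool), data, sigma),
    )
    return best, chi2_per_point(ensemble_profile(best, pool), data, sigma)


# Hypothetical pool: three conformers with assumed radii of gyration (in angstroms).
q = [0.005 * i for i in range(1, 21)]
pool = [guinier_profile(q, rg) for rg in (20.0, 30.0, 45.0)]

# Synthetic "experimental" data: a 70/30 mixture of the first and third conformers.
data = ensemble_profile((0.7, 0.0, 0.3), pool)
sigma = [0.01] * len(q)  # assumed constant error estimate

weights, chi2 = fit_ensemble(pool, data, sigma)
print(weights, chi2)
```

The grid search makes the constraints explicit (non-negative weights that sum to one), but its cost grows combinatorially with pool size. For pools of realistic size, a non-negative least-squares solver or a similar optimizer would be the natural replacement, and the same weighting could be applied to P(r) curves as well as I(q).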