Department of Computer Science and Software Engineering, Auburn University, Auburn, Alabama, United States of America.
Department of Biological Sciences, Auburn University, Auburn, Alabama, United States of America.
PLoS Comput Biol. 2021 Feb 23;17(2):e1008753. doi: 10.1371/journal.pcbi.1008753. eCollection 2021 Feb.
Crystallography and NMR System (CNS) is currently a widely used method for fragment-free ab initio protein folding from inter-residue distance or contact maps. Despite its widespread use in protein structure prediction, CNS is a decade-old macromolecular structure determination system that was originally developed for solving macromolecular geometry from experimental restraints, not for predictive modeling driven by interaction map data. As such, adapting the CNS experimental structure determination protocol to ab initio protein folding is intrinsically anomalous and may undermine the folding accuracy of computational protein structure prediction. In this paper, we propose a new CNS-free hierarchical structure modeling method called DConStruct for folding both soluble and membrane proteins driven by distance and contact information. Rigorous experimental validation shows that DConStruct attains much better reconstruction accuracy than CNS when tested with the same input contact map at varying contact thresholds. With the hierarchical modeling and iterative self-correction employed in DConStruct, folding accuracy improves much more than that of CNS as the contact threshold increases, ultimately approaching near-optimal reconstruction accuracy on higher-thresholded contact maps. The folding accuracy of DConStruct can be further improved by exploiting distance-based hybrid interaction maps at tri-level thresholding, as demonstrated by the better performance of our method in folding free modeling targets from the 12th and 13th rounds of the Critical Assessment of techniques for protein Structure Prediction (CASP) experiments compared to popular CNS- and fragment-based approaches and energy-minimization protocols, some of which even use much finer-grained distance maps than ours. Additional large-scale benchmarking shows that DConStruct can significantly improve the folding accuracy of membrane proteins compared to a CNS-based approach.
These results collectively demonstrate the feasibility of greatly improving the accuracy of ab initio protein folding by optimally exploiting the information encoded in inter-residue interaction maps beyond what is possible by CNS.
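To make the input data concrete, the following minimal Python sketch shows how a binary contact map at a chosen threshold, and a tri-level thresholded interaction map, could be derived from residue coordinates. It is an illustration only and not DConStruct's implementation: the function names, the 8/10/12 Å thresholds, the toy coordinates, and the `min_separation` parameter are all assumptions introduced here.

```python
import math

def distance_map(coords):
    """Pairwise Euclidean distances between residue coordinates (e.g. C-alpha atoms)."""
    n = len(coords)
    return [[math.dist(coords[i], coords[j]) for j in range(n)] for i in range(n)]

def contact_map(dmap, threshold=8.0, min_separation=1):
    """Binary map: 1 where two distinct residues lie closer than `threshold` angstroms.

    `threshold` and `min_separation` are illustrative defaults (assumptions).
    """
    n = len(dmap)
    return [
        [1 if abs(i - j) >= min_separation and dmap[i][j] < threshold else 0
         for j in range(n)]
        for i in range(n)
    ]

def hybrid_map(dmap, thresholds=(8.0, 10.0, 12.0)):
    """Tri-level map: label each residue pair with the smallest threshold it
    falls under, or None if it exceeds all of them (or lies on the diagonal).

    The three thresholds are hypothetical values chosen for illustration.
    """
    levels = sorted(thresholds)
    n = len(dmap)
    result = [[None] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i == j:
                continue
            for t in levels:
                if dmap[i][j] < t:
                    result[i][j] = t
                    break
    return result

# Toy example: four residues placed on a straight line, 3.8 angstroms apart.
coords = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0), (11.4, 0.0, 0.0)]
dmap = distance_map(coords)
contacts_8 = contact_map(dmap, threshold=8.0)
contacts_12 = contact_map(dmap, threshold=12.0)
tri_level = hybrid_map(dmap)
```

In this toy case, residues 0 and 3 are about 11.4 Å apart, so they are not in contact at the 8 Å threshold but are at 12 Å. This shows why raising the threshold gives a reconstruction method more restraints to work with.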