Yadahalli Shilpa, Jayanthi Lakshmi P, Gosavi Shachi
Simons Centre for the Study of Living Machines, National Centre for Biological Sciences, Tata Institute of Fundamental Research, Bangalore, India.
Front Mol Biosci. 2022 Jun 27;9:849272. doi: 10.3389/fmolb.2022.849272. eCollection 2022.
Many single-domain proteins are not only stable and water-soluble, but they also populate few to no intermediates during folding. This reduces interactions between partially folded proteins, misfolding, and aggregation, and makes the proteins tractable in biotechnological applications. Natural proteins fold in this way not necessarily only because their structures are well suited for folding, but also because their sequences optimize packing and fit their structures well. In contrast, folding experiments on the designed protein Top7 suggest that it populates several intermediates. Additionally, in protein design, where sequences are designed for natural and new non-natural structures, tens of sequences still need to be tested before success is achieved. Both of these issues may be caused by the specific scaffolds used in design, i.e., some protein scaffolds may be more tolerant than others to packing perturbations and varied sequences. Here, we report a computational method for assessing the response of protein structures to packing perturbations. We then benchmark this method using designed proteins and find that it can identify scaffolds whose folding is disrupted when packing is perturbed, leading to the population of intermediates. The method can also isolate regions of both natural and designed scaffolds that are sensitive to such perturbations and identify contacts that, when present, can rescue folding. Overall, this method can be used to identify protein scaffolds that are more amenable to whole-protein design, as well as to identify protein regions that are sensitive to perturbations and where further mutations should be avoided during protein engineering.