SOKENDAI, The Graduate University for Advanced Studies, 38 Nishigonaka, Myodaiji, Okazaki, 444-8585, Japan.
Institute for Molecular Science, 38 Nishigonaka, Myodaiji, Okazaki, 444-8585, Japan.
BMC Bioinformatics. 2021 Sep 27;22(1):465. doi: 10.1186/s12859-021-04380-5.
The design of protein structures from scratch requires special attention to the combination of the types and lengths of the secondary structures and the loops required to build highly designable backbone structure models. However, it is difficult to predict the combinations that result in globular and protein-like conformations without simulations. In this study, we used single-chain three-helix bundles as simple models of protein tertiary structures and sought to thoroughly investigate the conditions required to construct them, starting from the identification of the typical αα-hairpin motifs.
First, by statistical analysis of naturally occurring protein structures, we identified three αα-hairpin motifs that were specifically related to the left- and right-handedness of helix-helix packing. Second, specifying these αα-hairpin motifs as junctions, we performed sequence-independent backbone-building simulations to build and compare single-chain three-helix bundle structures, and identified the promising combinations of α-helix lengths and αα-hairpin types that result in tight packing between the first and third α-helices. Third, using those single-chain three-helix bundle backbone structures as templates, we designed amino acid sequences that were predicted to fold into the target topologies. This supports the conclusion that the compact single-chain three-helix bundle structures we sampled are of sufficient quality to allow amino acid sequence design.
The enumeration of the dominant subsets of possible backbone structures for small single-chain three-helix bundle topologies revealed that the compact foldable structures are discontinuously and sparsely distributed in the conformational space. Although the designs have not been experimentally validated in the present research, the comprehensive set of computational structural models generated here spares protein designers from building similar structures themselves and lets them focus quickly on specialized designs that start from the prebuilt structure models. The backbone and best design models in this study are publicly accessible at https://doi.org/10.5281/zenodo.4321632.
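The enumeration of helix-length and αα-hairpin-type combinations described above can be pictured with a minimal sketch. It is not the authors' pipeline: the helix lengths, the hairpin labels `"A"`, `"B"`, `"C"`, and the function name `enumerate_blueprints` are all hypothetical placeholders chosen for illustration, and the backbone-building simulation that would score each combination is omitted.

```python
from itertools import product

# Hypothetical values for illustration only; not taken from the study.
HELIX_LENGTHS = (10, 14, 18)      # candidate lengths (residues) for each helix
HAIRPIN_TYPES = ("A", "B", "C")   # placeholder labels for three hairpin motifs


def enumerate_blueprints(helix_lengths=HELIX_LENGTHS, hairpin_types=HAIRPIN_TYPES):
    """Yield (helix1, hairpin1, helix2, hairpin2, helix3) combinations."""
    for h1, h2, h3 in product(helix_lengths, repeat=3):
        for j1, j2 in product(hairpin_types, repeat=2):
            yield (h1, j1, h2, j2, h3)


blueprints = list(enumerate_blueprints())
print(len(blueprints))  # 3**3 helix-length choices * 3**2 hairpin choices = 243
```

Each tuple would then be handed to a backbone-building step, and only the combinations that give tight packing between the first and third helices would be kept. Even with three options per element the search space grows multiplicatively, which is why a precomputed set of models can save designers time.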