Laboratory of Structural Bioinformatics, Centre of New Technologies, University of Warsaw, 02-097 Warsaw, Poland.
Laboratory of Structural Bioinformatics, Centre of New Technologies, University of Warsaw, 02-097 Warsaw, Poland; Laboratory of Bioinformatics, Nencki Institute of Experimental Biology, Pasteura 3, 02-093 Warsaw, Poland.
J Struct Biol. 2018 Oct;204(1):117-124. doi: 10.1016/j.jsb.2018.07.002. Epub 2018 Jul 2.
In protein modelling and design, an understanding of the relationship between sequence and structure is essential. Using parallel, homotetrameric coiled-coil structures as a model system, we demonstrated that machine learning techniques can be used to predict structural parameters directly from the sequence. Coiled coils are regular protein structures of great interest as building blocks for assembling larger nanostructures. They are composed of two or more alpha-helices wrapped around each other to form a supercoiled bundle. Coiled-coil bundles are defined by four basic structural parameters: topology (parallel or antiparallel), radius, degree of supercoiling, and the rotation of the helices around their axes. In parallel coiled coils, the last of these parameters, which describes the hydrophobic core packing geometry, was assumed to show little variation. However, we found that subtle differences between structures of this type were not artifacts of structure determination and could be predicted directly from the sequence. Using this information in modelling narrows the structural parameter space that must be searched and thus significantly reduces the required computational time. Moreover, the sequence-structure rules can be used to explain the effects of point mutations and to shed light on the relationship between hydrophobic core architecture and coiled-coil topology.
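The claim that a sequence-based prediction narrows the parameter space can be illustrated with a minimal sketch. It is not the authors' method. The class `BundleParams`, the helpers `frange` and `build_grid`, every numeric range, the units, and the value of `predicted_rotation` are hypothetical and chosen only for illustration. The sketch enumerates a grid over radius, supercoiling and helix rotation, then restricts the rotation axis to a window around an assumed predicted value.

```python
from dataclasses import dataclass
from itertools import product


@dataclass(frozen=True)
class BundleParams:
    """The four basic bundle parameters named in the abstract."""
    parallel: bool    # topology: parallel (True) or antiparallel (False)
    radius: float     # bundle radius (unit assumed: angstroms)
    supercoil: float  # degree of supercoiling (unit assumed: degrees per residue)
    rotation: float   # rotation of helices around their own axes (degrees)


def frange(start, stop, step):
    """Inclusive float range; uses integer steps to avoid accumulated drift."""
    n = int(round((stop - start) / step))
    return [start + i * step for i in range(n + 1)]


def build_grid(radii, supercoils, rotations, parallel=True):
    """Enumerate every combination of the sampled parameter values."""
    return [
        BundleParams(parallel, r, s, rot)
        for r, s, rot in product(radii, supercoils, rotations)
    ]


# Hypothetical sampling ranges, not taken from the paper.
radii = frange(7.0, 8.0, 0.25)
supercoils = frange(-4.0, -2.0, 0.5)
all_rotations = frange(-30.0, 30.0, 2.0)

# Stand-in for a value predicted from the sequence (assumed, not real data).
predicted_rotation = 6.0
tolerance = 4.0
near_rotations = [
    rot for rot in all_rotations if abs(rot - predicted_rotation) <= tolerance
]

full_grid = build_grid(radii, supercoils, all_rotations)
narrow_grid = build_grid(radii, supercoils, near_rotations)

print(f"full search: {len(full_grid)} candidate models")
print(f"narrowed search: {len(narrow_grid)} candidate models")
```

With these assumed ranges, the rotation axis shrinks from 31 sampled values to 5, so the number of candidate models to build and score falls by the same factor. The actual saving depends on how tightly the rotation can be predicted.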