Martinez-Goikoetxea Mikel
Department of Protein Evolution, Max Planck Institute for Biology, Tübingen 72076, Germany.
Bioinform Adv. 2024 Dec 6;5(1):vbae195. doi: 10.1093/bioadv/vbae195. eCollection 2025.
Coiled coils are a widespread structural motif consisting of multiple α-helices that wind around a central axis to bury their hydrophobic core. AlphaFold has emerged as an effective coiled-coil modeling tool, capable of accurately predicting changes in periodicity and core geometry along coiled-coil stalks. It nevertheless has limitations, such as generating spuriously bent models and failing to model globally non-canonical coiled coils effectively. To overcome these limitations, we investigated whether dividing full-length sequences into fragments would result in better models.
We developed CCfrag to leverage AlphaFold for the piece-wise modeling of coiled coils. The user creates a specification, defined by window size, length of overlap, and oligomerization state, and the program produces the files necessary to run AlphaFold predictions. The structural models and their scores are then integrated into a rich per-residue representation defined by sequence- or structure-based features. Our results suggest that removing coiled-coil sequences from their native context can improve prediction confidence and result in better models. In this article, we present various use cases of CCfrag and propose that fragment-based prediction is useful for understanding the properties of long, fibrous coiled coils by revealing local features not seen in full-length models.
The program is implemented as a Python module. The code and its documentation are available at https://github.com/Mikel-MG/CCfrag.
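The piece-wise scheme described above can be illustrated with a minimal sketch. It is not the CCfrag API: the function names (`fragment_sequence`, `to_fasta`, `per_residue_mean`) are hypothetical, the colon-joined multimer line assumes a ColabFold-style input convention, and averaging scores over overlapping fragments is only one plausible way to build a per-residue value, not necessarily the integration CCfrag performs.

```python
from typing import List, Optional, Tuple

Fragment = Tuple[int, int, str]  # (1-based start, 1-based end, subsequence)


def fragment_sequence(sequence: str, window: int, overlap: int) -> List[Fragment]:
    """Split a sequence into windows of `window` residues sharing `overlap` residues."""
    if window <= 0:
        raise ValueError("window must be positive")
    if not 0 <= overlap < window:
        raise ValueError("overlap must satisfy 0 <= overlap < window")
    if not sequence:
        return []
    step = window - overlap
    fragments: List[Fragment] = []
    start = 0
    while True:
        end = min(start + window, len(sequence))
        fragments.append((start + 1, end, sequence[start:end]))
        if end == len(sequence):  # the last fragment may be shorter than `window`
            break
        start += step
    return fragments


def to_fasta(name: str, fragments: List[Fragment], n_chains: int) -> str:
    """Write one FASTA record per fragment, repeating it once per chain.

    Assumption: chains are joined with ':' (ColabFold-style multimer input).
    """
    lines: List[str] = []
    for start, end, frag in fragments:
        lines.append(f">{name}_{start}-{end}")
        lines.append(":".join([frag] * n_chains))
    return "\n".join(lines) + "\n"


def per_residue_mean(
    length: int, fragment_scores: List[Tuple[int, List[float]]]
) -> List[Optional[float]]:
    """Average a per-residue score over every fragment covering each position.

    `fragment_scores` holds (1-based fragment start, scores along the fragment).
    Positions not covered by any fragment are returned as None.
    """
    totals = [0.0] * length
    counts = [0] * length
    for start, scores in fragment_scores:
        for offset, score in enumerate(scores):
            idx = start - 1 + offset
            totals[idx] += score
            counts[idx] += 1
    return [t / c if c else None for t, c in zip(totals, counts)]
```

With this sketch, a window of 4 and an overlap of 2 over a 10-residue sequence yields four fragments starting at residues 1, 3, 5, and 7. Setting `n_chains` to 2 would request each fragment as a homodimer.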