Department of Chemistry, Massachusetts Institute of Technology, Cambridge, MA.
Koch Institute for Integrative Cancer Research, Massachusetts Institute of Technology, Cambridge, MA.
Proc Natl Acad Sci U S A. 2024 Nov 12;121(46):e2412948121. doi: 10.1073/pnas.2412948121. Epub 2024 Nov 6.
Collagens are the foundational component of diverse tissues, including skin, bone, cartilage, and basement membranes, and are the most abundant protein class in animals. The fibrillar collagens are large, complex, multidomain proteins, all containing the characteristic triple-helix motif. The most prevalent collagens are heterotrimeric, meaning that cells express at least two distinct procollagen polypeptides that must assemble into specific heterotrimer compositions. The molecular mechanisms ensuring correct heterotrimeric assemblies are poorly understood, even for the most common collagen, type I. The longstanding paradigm is that assembly is controlled entirely by the ~30 kDa globular C-propeptide (C-Pro) domain. Still, this dominant model for procollagen assembly has left many questions unanswered. Here, we show that the C-Pro paradigm is incomplete. In addition to the critical role of the C-Pro domain in templating assembly, we find that the amino acid sequence near the C terminus of procollagen's triple-helical domain plays an essential role in defining procollagen assembly outcomes. These sequences encode conformationally stabilizing features that ensure only desirable C-Pro-mediated trimeric templates are committed to irreversible triple-helix folding. Incorrect C-Pro trimer assemblies avoid commitment to triple-helix formation thanks to destabilizing features in the amino acid sequences of their triple helix. Incorrect C-Pro assemblies are consequently able to dissociate and search for new binding partners. These findings provide a distinctive perspective on the mechanism of procollagen assembly, revealing the molecular basis by which incorrect homotrimer assemblies are avoided and setting the stage for a deeper understanding of the biogenesis of this ubiquitous protein.