Cromar Graham, Wong Ka-Chun, Loughran Noeleen, On Tuan, Song Hongyan, Xiong Xuejian, Zhang Zhaolei, Parkinson John
Program in Molecular Structure and Function, Hospital for Sick Children, Toronto, Ontario, Canada; Department of Molecular Genetics, University of Toronto, Ontario, Canada.
Department of Computer Science, University of Toronto, Ontario, Canada; Terrence Donnelly Centre for Cellular and Biomolecular Research, University of Toronto, Ontario, Canada.
Genome Biol Evol. 2014 Oct 15;6(10):2897-917. doi: 10.1093/gbe/evu228.
The extracellular matrix (ECM) is a defining characteristic of metazoans and consists of a meshwork of self-assembling, fibrous proteins and their functionally related neighbours. Previous studies, focusing on a limited number of gene families, suggest that vertebrate complexity predominantly arose through the duplication and subsequent modification of retained, preexisting ECM genes. These genes provided the structural underpinnings to support a variety of specialized tissues, as well as a platform for the organization of spatio-temporal signaling and cell migration. However, the relative contributions of ancient versus novel domains to ECM evolution have not been quantified across the full range of ECM proteins. Here, utilizing a high-quality list comprising 324 ECM genes, we reveal general and clade-specific domain combinations, identifying domains of eukaryotic and metazoan origin that have been recruited into new roles in approximately two-thirds of the ECM proteins in humans, which represent novel vertebrate proteins. We show that, rather than the acquisition of new domains, the sampling of new domain combinations has been key to the innovation of paralogous ECM genes during vertebrate evolution. Applying a novel framework for identifying potentially important, noncontiguous, conserved arrangements of domains, we find that the distinct biological characteristics of the ECM have arisen through unique evolutionary processes. These include the preferential recruitment of novel domains to existing architectures and the utilization of high-promiscuity domains in organizing the ECM network around a connected array of structural hubs. Our focus on ECM proteins reveals that distinct types of proteins and/or the biological systems in which they operate have influenced the types of evolutionary forces that drive protein innovation. This emphasizes the need for rigorously defined systems when addressing evolutionary questions that focus on specific systems of interacting proteins.