Department of Immunology, Genetics and Pathology, The Rudbeck Laboratory, Uppsala University, Uppsala, Sweden.
Department of Life Sciences, University of Siena, Siena, Italy.
BMC Genomics. 2022 Jun 4;23(1):420. doi: 10.1186/s12864-022-08659-6.
Group XIV of the C-type lectin domain-containing proteins (CTLDcps) is one of the seventeen CTLDcp groups discovered in mammals and is composed of four members: CD93, Clec14A, CD248 and Thrombomodulin, which have been shown to be important players in cancer and vascular biology. Although these proteins belong to the same family, their phylogenetic relationship has never been dissected. To resolve their evolution and characterize their protein domain composition, we investigated CTLDcp genes in gnathostomes and cyclostomes and, by means of phylogenetic and synteny analyses, inferred an evolutionary scheme that attempts to unravel their evolution in modern vertebrates.
Here, we demonstrate the paralogy of the group XIV CTLDcps in gnathostomes and show that CD248 and Clec14A were lost in different vertebrate groups: CD248 was lost through chromosome disruption in birds, whereas the loss of Clec14A in monotremes and marsupials did not involve chromosome rearrangements. Moreover, using genome annotations of different lampreys and one hagfish species, we investigated the origin and evolution of the modern group XIV CTLDcps. Furthermore, we retrieved and annotated gnathostome CTLDcp domains, pointed out important differences in domain composition between gnathostome classes, and assessed the codon substitution rate of each domain by analyzing the ratio of nonsynonymous (Ka) to synonymous (Ks) substitutions, using one representative species per gnathostome order.
CTLDcps appeared with the advent of early vertebrates after a whole genome duplication followed by a sporadic tandem duplication. These duplication events gave rise to three CTLDcps in the ancestral vertebrate, which underwent further duplications caused by the independent polyploidizations that characterized the evolution of cyclostomes and gnathostomes. Importantly, our analyses of CTLDcps in gnathostomes revealed critical inter-class differences in both extracellular and intracellular domains, which might help in interpreting experimental results and understanding differences between animal models.