The Hebrew University of Jerusalem, Israel.
BMC Genomics. 2009 Dec 10;10:593. doi: 10.1186/1471-2164-10-593.
The complete proteome of the starlet sea anemone, Nematostella vectensis, provides insights into gene invention dating back to the Cnidarian-Bilaterian ancestor. With the addition of the complete proteomes of Hydra magnipapillata and Monosiga brevicollis, the investigation of proteins having unique features in early metazoan life has become practical. We focused on the properties and the evolutionary trends of tandem repeat (TR) sequences in Cnidaria proteomes.
We found that 11-16% of N. vectensis proteins contain tandem repeats. Most TRs cover 150-amino-acid segments composed of basic units of 5-20 amino acids. In total, the N. vectensis proteome has about 3300 unique TR-units, but only a small fraction of them are shared with the H. magnipapillata, M. brevicollis, or mammalian proteomes. The overall abundance of these TRs stands out relative to that of 14 proteomes representing the diversity among eukaryotes and within the metazoan world. TR-units are characterized by a unique amino acid composition, with cysteine and histidine over-represented. Structurally, most TR-segments are associated with coiled and disordered regions. Interestingly, 80% of the TR-segments can be read in more than one open reading frame. For over 100 of them, translation of the alternative frames would result in long proteins. Most domain families that are characterized as repeats in eukaryotes are found in the TR-proteomes from Nematostella and Hydra.
Although most TR-proteins originate from prediction tools and still await experimental validation, supportive evidence exists for hundreds of TR-units in Nematostella. The existence of TR-proteins in early metazoan life may have served as a robust mode for generating novel genes with previously overlooked structural and functional characteristics.
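To make the notion of a TR-segment built from basic units of 5-20 amino acids concrete, the following is a minimal sketch of an exact tandem-repeat scan over a protein sequence. It is not the method used in the study: the abstract does not name its detection tool, and real repeat finders tolerate mismatches and indels, whereas this sketch only matches identical back-to-back copies. The function name `find_exact_tandem_repeats`, its parameters, and the toy sequence are all hypothetical.

```python
def find_exact_tandem_repeats(seq, min_unit=5, max_unit=20, min_copies=2):
    """Return (start, unit, copies) for exact back-to-back repeats in seq.

    Assumption: only perfect copies count. Unit lengths default to the
    5-20 amino acid range mentioned in the abstract.
    """
    hits = []
    i = 0
    n = len(seq)
    while i < n:
        best = None  # (unit, copies) covering the longest span from position i
        for k in range(min_unit, max_unit + 1):
            unit = seq[i:i + k]
            if len(unit) < k:
                break  # not enough sequence left for a unit of this length
            copies = 1
            # Count how many identical copies of the unit follow each other.
            while seq[i + copies * k:i + (copies + 1) * k] == unit:
                copies += 1
            if copies >= min_copies:
                if best is None or copies * k > best[1] * len(best[0]):
                    best = (unit, copies)
        if best:
            unit, copies = best
            hits.append((i, unit, copies))
            i += len(unit) * copies  # skip past the reported segment
        else:
            i += 1
    return hits


# Toy sequence (made up): a 7-residue unit repeated three times.
toy = "MK" + "ACDEFGH" * 3 + "WY"
print(find_exact_tandem_repeats(toy))
```

Scanning a whole proteome would mean running this over every predicted protein and collecting the distinct units, but any resulting counts would depend heavily on the matching rule and should not be compared directly with the figures reported here.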