Zhang Yan, Fomenko Dmitri E, Gladyshev Vadim N
Department of Biochemistry, University of Nebraska, Lincoln, NE 68588-0664, USA.
Genome Biol. 2005;6(4):R37. doi: 10.1186/gb-2005-6-4-r37. Epub 2005 Mar 29.
Selenocysteine (Sec) is a rare amino acid that occurs in proteins across the major domains of life. It is encoded by TGA, which also serves as the signal for termination of translation, so available annotation tools cannot identify selenoprotein genes. Information on full sets of selenoproteins (selenoproteomes) is essential for understanding the biology of selenium. Herein, we characterized the selenoproteome of the largest microbial sequence dataset, the Sargasso Sea environmental genome project.
We identified 310 selenoprotein genes that clustered into 25 families, including 101 new selenoprotein genes that belonged to 15 families. Most of these proteins were predicted redox proteins containing catalytic selenocysteines. Several bacterial selenoproteins previously thought to be restricted to eukaryotes were detected by analyzing eukaryotic and bacterial SECIS elements, suggesting that eukaryotic and bacterial selenoprotein sets partially overlapped. The Sargasso Sea microbial selenoproteome was rich in selenoproteins and its composition was different from that observed in the combined set of completely sequenced genomes, suggesting that these genomes do not accurately represent the microbial selenoproteome. Most detected selenoproteins occurred sporadically compared to the widespread presence of their cysteine homologs, suggesting that many selenoproteins recently evolved from cysteine-containing homologs.
This study yielded the largest selenoprotein dataset to date, doubled the number of prokaryotic selenoprotein families and provided insights into forces that drive selenocysteine evolution.
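The annotation problem noted in the background, that TGA is read both as Sec and as a stop signal, can be illustrated with a minimal Python sketch. This is not the authors' pipeline. The function names and the toy sequence are hypothetical, and a real search would need further evidence, such as the SECIS elements and cysteine homologs mentioned above.

```python
# Illustrative only: a toy view of why in-frame TGA confuses conventional annotation.
# All names here are hypothetical and not taken from the study.

STOP_CODONS = {"TAA", "TAG", "TGA"}


def codons(seq):
    """Split a DNA sequence into in-frame codons, dropping a trailing partial codon."""
    seq = seq.upper()
    usable = len(seq) - len(seq) % 3
    return [seq[i:i + 3] for i in range(0, usable, 3)]


def conventional_orf_length(seq):
    """Count codons before the first stop of any kind (TGA treated as a stop)."""
    cs = codons(seq)
    for n, codon in enumerate(cs):
        if codon in STOP_CODONS:
            return n
    return len(cs)


def candidate_sec_positions(seq):
    """Return codon indices of in-frame TGA codons that occur before the first
    TAA/TAG and are not the last codon, i.e. TGAs with sequence downstream."""
    cs = codons(seq)
    positions = []
    for n, codon in enumerate(cs):
        if codon in ("TAA", "TAG"):
            break
        if codon == "TGA" and n < len(cs) - 1:
            positions.append(n)
    return positions


toy = "ATGGCTTGAGGTTAA"  # ATG GCT TGA GGT TAA
print(conventional_orf_length(toy))   # codons kept when TGA is read as a stop
print(candidate_sec_positions(toy))   # TGA codons that might instead encode Sec
```

On the toy sequence, a conventional reading ends the open reading frame at the TGA, whereas the second function flags that codon as a position worth checking for Sec insertion.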