Barakat A, Szick-Miranda K, Chang IF, Guyot R, Blanc G, Cooke R, Delseny M, Bailey-Serres J
Laboratoire Génome et Développement des Plantes, Unité Mixte de Recherche 5096 Centre National de la Recherche Scientifique, Université de Perpignan, 52 Avenue de Villeneuve, 66860 Perpignan cedex, France.
Plant Physiol. 2001 Oct;127(2):398-415.
Eukaryotic ribosomes are made of two components: four ribosomal RNAs and approximately 80 ribosomal proteins (r-proteins). The exact number of r-proteins and r-protein genes in higher plants is not known. The strong conservation of eukaryotic r-protein primary sequence allowed us to use the well-characterized rat (Rattus norvegicus) r-protein set to identify orthologues on the five haploid chromosomes of Arabidopsis. Using the numerous expressed sequence tag (EST) accessions and the complete genomic sequence of this species, we identified 249 genes (including some pseudogenes) corresponding to 80 (32 small subunit and 48 large subunit) cytoplasmic r-protein types. None of the r-protein genes is single copy, and most r-proteins are encoded by three or four expressed genes, indicative of the internal duplication of the Arabidopsis genome. The r-protein genes are distributed throughout the genome. Inspection of genes in the vicinity of r-protein gene family members confirms extensive duplications of large chromosome fragments and sheds light on the evolutionary history of the Arabidopsis genome. Examination of large duplicated regions indicated that a significant fraction of the r-protein genes have been either lost from one of the duplicated fragments or inserted after the initial duplication event. Only 52 r-protein genes lack a matching EST accession, and 19 of these contain incomplete open reading frames, confirming that most genes are expressed. Assessment of cognate EST numbers suggests that r-protein gene family members are differentially expressed.
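The counts reported in the abstract can be cross-checked with a few lines of arithmetic. The sketch below is illustrative only: the variable names are hypothetical, the figures are taken directly from the abstract, and the per-type average is a rough indicator because the 249 genes include some pseudogenes.

```python
# Counts quoted in the abstract (variable names are illustrative, not the authors').
total_genes = 249            # r-protein genes identified, including some pseudogenes
small_subunit_types = 32     # small-subunit r-protein types
large_subunit_types = 48     # large-subunit r-protein types
genes_without_est = 52       # genes lacking a matching EST accession
incomplete_orf_no_est = 19   # of those, genes with incomplete open reading frames

# Total number of cytoplasmic r-protein types.
total_types = small_subunit_types + large_subunit_types

# Genes supported by at least one matching EST accession.
genes_with_est = total_genes - genes_without_est
fraction_with_est = genes_with_est / total_genes

# Genes without EST support that still carry a complete open reading frame.
complete_orf_no_est = genes_without_est - incomplete_orf_no_est

# Rough average family size (pseudogenes are included in the numerator).
mean_genes_per_type = total_genes / total_types

print(f"r-protein types: {total_types}")
print(f"genes with EST support: {genes_with_est} ({fraction_with_est:.1%})")
print(f"genes without EST but with complete ORF: {complete_orf_no_est}")
print(f"mean genes per r-protein type: {mean_genes_per_type:.2f}")
```

Under these figures, roughly four in five genes have EST support, and the average family size of about three genes per type is consistent with the statement that most r-proteins are encoded by three or four genes.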