Fizames Cécile, Muños Stéphane, Cazettes Céline, Nacry Philippe, Boucherez Jossia, Gaymard Frédéric, Piquemal David, Delorme Valérie, Commes Thérèse, Doumas Patrick, Cooke Richard, Marti Jacques, Sentenac Hervé, Gojon Alain
Biochimie et Physiologie Moléculaire des Plantes, Unité Mixte de Recherche 5004, Agro-M/Centre National de la Recherche Scientifique/Institut National de la Recherche Agronomique/UM2, Place Viala, 34060 Montpellier 1, France.
Plant Physiol. 2004 Jan;134(1):67-80. doi: 10.1104/pp.103.030536.
Large-scale identification of genes expressed in roots of the model plant Arabidopsis was performed by serial analysis of gene expression (SAGE) on a total of 144,083 sequenced tags, representing at least 15,964 different mRNAs. For tag-to-gene assignment, we developed a computational approach based on 26,620 genes annotated from the complete sequence of the genome. The selected procedure reliably identifies the genes corresponding to the majority of the tags found experimentally and provides a reference database for SAGE studies in Arabidopsis. This new resource allowed us to characterize the expression of more than 3,000 genes for which there is no expressed sequence tag (EST) or cDNA in the databases. Moreover, 85% of the tags were specific for one gene. To illustrate this advantage of SAGE for functional genomics, we show that our data allow an unambiguous analysis of most of the individual genes belonging to 12 different ion transporter multigene families. These results indicate that, compared with EST-based tag-to-gene assignment, the use of the annotated genome sequence greatly improves gene identification in SAGE studies. However, more than 6,000 different tags remained with no gene match, suggesting that a significant proportion of the transcripts present in roots originate from as yet unknown or wrongly annotated genes. The root transcriptome characterized in this study differs markedly from those obtained in other organs and provides a unique resource for investigating the functional specificities of the root system. As an example of the use of SAGE for transcript profiling in Arabidopsis, we report the identification of 270 genes differentially expressed between roots of plants grown with either NO3- or NH4NO3 as the N source.
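The abstract does not spell out the tag-to-gene assignment procedure, so the following is only a minimal sketch of how such an assignment is commonly set up, not the authors' method. It assumes the classic SAGE convention of a 10-base tag taken immediately downstream of the 3'-most NlaIII site (CATG) of each annotated transcript. The function names and toy sequences are hypothetical, and real pipelines must also handle complications ignored here, such as uncertain 3' ends and antisense tags.

```python
from collections import defaultdict

ANCHOR = "CATG"  # assumption: NlaIII anchoring enzyme, as in classic SAGE
TAG_LEN = 10     # assumption: 10-base tags downstream of the anchor site


def virtual_tag(transcript):
    """Return the tag after the 3'-most anchor site, or None if there is none."""
    pos = transcript.rfind(ANCHOR)
    if pos == -1:
        return None
    start = pos + len(ANCHOR)
    tag = transcript[start:start + TAG_LEN]
    # Discard truncated tags that run off the end of the transcript.
    return tag if len(tag) == TAG_LEN else None


def build_tag_index(transcripts):
    """Map each virtual tag to the set of annotated genes that can produce it."""
    index = defaultdict(set)
    for gene, seq in transcripts.items():
        tag = virtual_tag(seq.upper())
        if tag is not None:
            index[tag].add(gene)
    return index


def assign_tags(observed_counts, index):
    """Split observed tags into gene-specific, ambiguous, and unmatched."""
    unique, ambiguous, unmatched = {}, {}, {}
    for tag, count in observed_counts.items():
        genes = index.get(tag, set())
        if len(genes) == 1:
            unique[tag] = (next(iter(genes)), count)
        elif genes:
            ambiguous[tag] = (sorted(genes), count)
        else:
            unmatched[tag] = count
    return unique, ambiguous, unmatched


# Toy data (hypothetical sequences and counts, for illustration only).
toy_transcripts = {
    "geneA": "TTCATG" + "ACGTACGTAC" + "AAAA",
    "geneB": "GGCATG" + "TTTTGGGGCC" + "AAAA",
    "geneC": "CCCATG" + "TTTTGGGGCC" + "AAAA",  # shares its tag with geneB
}
toy_counts = {"ACGTACGTAC": 12, "TTTTGGGGCC": 3, "GGGGGGGGGG": 1}

toy_index = build_tag_index(toy_transcripts)
toy_unique, toy_ambiguous, toy_unmatched = assign_tags(toy_counts, toy_index)
```

In this toy run, one tag is specific for a single gene, one is shared by two genes, and one has no gene match. These three categories correspond to the gene-specific tags (85% in the study), the tags that match more than one gene, and the more than 6,000 tags that remained without a gene match.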