Fedorova Maria, van de Mortel Judith, Matsumoto Peter A, Cho Jennifer, Town Christopher D, VandenBosch Kathryn A, Gantt J Stephen, Vance Carroll P
Department of Agronomy and Plant Genetics, 1991 Upper Bedford Circle, University of Minnesota, St. Paul, MN 55108, USA.
Plant Physiol. 2002 Oct;130(2):519-37. doi: 10.1104/pp.006833.
The Medicago truncatula expressed sequence tag (EST) database (Gene Index) contains over 140,000 sequences from 30 cDNA libraries. This resource offers the possibility of identifying previously uncharacterized genes and assessing the frequency and tissue specificity of their expression in silico. Because M. truncatula, unlike Arabidopsis, forms symbiotic root nodules, this approach is particularly important for investigating genes specific to nodule development and function in legumes. Our analyses have revealed 340 putative gene products, or tentative consensus sequences (TCs), expressed solely in root nodules. Each of these TCs was represented by 2 to 379 ESTs. Of these TCs, 3% appear to encode novel proteins, 57% encode proteins with weak similarity to GenBank accessions, and 40% encode proteins with strong similarity to known proteins. Nodule-specific TCs were grouped into nine categories based on the predicted function of their protein products. Besides previously characterized nodulins, other examples of highly abundant nodule-specific transcripts include plantacyanin, agglutinin, embryo-specific protein, and purine permease. Six nodule-specific TCs encode calmodulin-like proteins that possess a unique cleavable transit sequence potentially targeting the protein into the peribacteroid space. Surprisingly, 114 nodule-specific TCs encode small Cys cluster proteins with a cleavable transit peptide. To determine the validity of the in silico analysis, expression of 91 putative nodule-specific TCs was analyzed by macroarray and RNA-blot hybridizations. Nodule-enhanced expression was confirmed experimentally for the TCs composed of five or more ESTs, whereas the results for those TCs containing fewer ESTs were variable.
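The in silico selection described in the abstract can be illustrated with a minimal sketch. This is not the authors' pipeline: the function name, the record layout (one `(tc_id, library)` pair per EST), the library names, and the toy data are all hypothetical. It assumes a TC counts as nodule-specific when every one of its ESTs comes from a nodule library, and it uses the abstract's figures only as thresholds (at least two ESTs per TC, and five or more ESTs for the group whose nodule-enhanced expression was confirmed experimentally).

```python
from collections import defaultdict


def nodule_specific_tcs(est_records, nodule_libraries, min_ests=2):
    """Return {tc_id: est_count} for TCs whose ESTs all come from nodule libraries.

    est_records: iterable of (tc_id, library_name) pairs, one per EST (assumed layout).
    nodule_libraries: set of library names derived from root nodules (assumed labels).
    min_ests: minimum number of ESTs a TC must contain to be reported.
    """
    # Count ESTs per TC and per source library.
    counts = defaultdict(lambda: defaultdict(int))
    for tc_id, library in est_records:
        counts[tc_id][library] += 1

    specific = {}
    for tc_id, per_library in counts.items():
        total = sum(per_library.values())
        only_nodule = all(lib in nodule_libraries for lib in per_library)
        if total >= min_ests and only_nodule:
            specific[tc_id] = total
    return specific


# Hypothetical toy data, not taken from the Gene Index.
nodule_libraries = {"nodule_lib1", "nodule_lib2"}
est_records = (
    [("TC_A", "nodule_lib1")] * 4
    + [("TC_A", "nodule_lib2")] * 2   # 6 ESTs, nodule libraries only
    + [("TC_B", "nodule_lib1")] * 2
    + [("TC_B", "root_lib")]          # also seen in roots, so excluded
    + [("TC_C", "nodule_lib2")] * 2   # 2 ESTs, nodule libraries only
    + [("TC_D", "nodule_lib1")]       # a single EST, below min_ests
)

specific = nodule_specific_tcs(est_records, nodule_libraries)

# TCs with five or more ESTs: the group whose in silico prediction
# was confirmed experimentally according to the abstract.
well_supported = {tc for tc, n in specific.items() if n >= 5}
```

With this toy input, `specific` holds `TC_A` and `TC_C`, and only `TC_A` falls into `well_supported`. Counting ESTs per library also gives the in silico expression frequency mentioned in the abstract, which is why the sketch keeps per-library counts instead of a simple set of libraries.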