Zhou Yan, Tang Jiabin, Walker Michael G, Zhang Xiuqing, Wang Jun, Hu Songnian, Xu Huayong, Deng Yajun, Dong Jianhai, Ye Lin, Lin Li, Li Jun, Wang Xuegang, Xu Hao, Pan Yibin, Lin Wei, Tian Wei, Liu Jing, Wei Liping, Liu Siqi, Yang Huanming, Yu Jun, Wang Jian
Hangzhou Genomics Institute/Institute of Bioinformatics of Zhejiang University/Key Laboratory of Bioinformatics of Zhejiang Province, Hangzhou 310007, China.
Genomics Proteomics Bioinformatics. 2003 Feb;1(1):26-42. doi: 10.1016/s1672-0229(03)01005-2.
Expressed sequence tag (EST) analysis has pioneered genome-wide gene discovery and expression profiling. To establish a gene expression index for indica rice, we sequenced and analyzed 86,136 ESTs from nine cDNA libraries of the super hybrid cultivar LYP9 and its parental cultivars. We assembled these ESTs into 13,232 contigs, leaving 8,976 singletons. Of these 22,208 sequences, 7,497 were similar to existing sequences in GenBank and 14,711 were novel. The sequences were classified by molecular function, biological process, and pathway according to the Gene Ontology. We compared our ESTs with the 95,000 publicly available japonica ESTs and found little sequence variation, despite the large difference between the genome sequences. We then assembled the combined 173,000 rice ESTs for further analysis. Using the pooled ESTs, we compared gene expression in metabolic pathways between rice and Arabidopsis according to KEGG. After verifying by sequence coverage that the libraries were comparable, we profiled gene expression patterns in different tissues, at different developmental stages, and in a conditional sterile mutant. We also identified possible library-specific genes and a number of enzymes and transcription factors that contribute to rice development.