School of Agriculture, Meiji University, Kawasaki, 214-8571 Japan.
Plant Cell Physiol. 2011 Feb;52(2):220-9. doi: 10.1093/pcp/pcq195. Epub 2010 Dec 23.
Similarity of gene expression profiles provides important clues for understanding the biological functions of genes and the biological processes and metabolic pathways in which they take part. A gene expression network (GEN) is well suited to capturing such expression profile similarities among many genes simultaneously. For GEN construction, the Pearson correlation coefficient (PCC) has been widely used as an index of expression profile similarity between gene pairs. However, calculating PCCs for all gene pairs requires large amounts of both time and computer resources. Based on correspondence analysis, we developed a new method for GEN construction that takes minimal time even for large-scale expression data on ordinary computing hardware. Moreover, our method requires no prior parameters to remove sample redundancies in the data set. Using the new method, we constructed rice GENs from large-scale microarray data stored in a public database. We then collected and integrated the principal rice omics annotations held in separate public databases. The integrated information contains annotations of the genome, transcriptome and metabolic pathways. We thus developed the integrated database OryzaExpress for browsing GENs, through an interactive graphical viewer, together with the principal omics annotations (http://riceball.lab.nig.ac.jp/oryzaexpress/). By integrating Arabidopsis GEN data from ATTED-II, OryzaExpress also allows GENs to be compared between rice and Arabidopsis. Thus, OryzaExpress is a comprehensive rice database that supports omics approaches across plant science and facilitates systems biology.
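To make the cost of the conventional approach concrete, the following is a minimal sketch of PCC-based GEN construction, the baseline the abstract describes, and not the correspondence-analysis method the authors developed. The function names `pcc_matrix` and `gen_edges`, the toy expression values, the gene labels, and the 0.9 threshold are all assumptions chosen for illustration. The sketch also assumes that no gene has a constant expression profile, since that would make its PCC undefined.

```python
import numpy as np

def pcc_matrix(expr):
    """Pearson correlation between every pair of rows (genes).

    expr: 2-D array, shape (n_genes, n_samples). Assumes no row is constant.
    """
    centered = expr - expr.mean(axis=1, keepdims=True)
    unit = centered / np.linalg.norm(centered, axis=1, keepdims=True)
    # The dot product of two centered, unit-length profiles is their PCC.
    return unit @ unit.T

def gen_edges(expr, gene_ids, threshold=0.9):
    """Return (gene, gene, pcc) for each gene pair with pcc >= threshold."""
    r = pcc_matrix(expr)
    # Upper triangle only: each unordered pair once, no self-pairs.
    rows, cols = np.triu_indices(len(gene_ids), k=1)
    keep = r[rows, cols] >= threshold
    return [
        (gene_ids[a], gene_ids[b], float(r[a, b]))
        for a, b in zip(rows[keep], cols[keep])
    ]

# Hypothetical toy data: 4 genes measured in 4 samples.
expr = np.array([
    [1.0, 2.0, 3.0, 4.0],
    [2.0, 4.0, 6.0, 8.0],
    [4.0, 3.0, 2.0, 1.0],
    [1.0, 3.0, 2.0, 4.0],
])
gene_ids = ["geneA", "geneB", "geneC", "geneD"]
edges = gen_edges(expr, gene_ids)
```

For n genes this evaluates n(n-1)/2 pairs and holds an n × n matrix in memory, so both time and memory grow quadratically with the number of genes. That growth is the resource problem the abstract raises for large-scale expression data.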