Aach J, Rindone W, Church G M
Department of Genetics, Harvard Medical School, Boston, Massachusetts 02115 USA.
Genome Res. 2000 Apr;10(4):431-45. doi: 10.1101/gr.10.4.431.
We report steps toward the systematic management, standardization, and analysis of functional genomics data. We developed the ExpressDB database for yeast RNA expression data and loaded it with approximately 17.5 million pieces of data reported by 11 studies with three different kinds of high-throughput RNA assays. A web-based tool supports queries across the data from these studies. We examined comparability of data by converting data from 9 studies (217 conditions) into mRNA relative abundance estimates (ERAs) and by clustering of conditions by ERAs. We report on generation of ERAs and condition clustering for non-microarray data (5 studies, 63 conditions) and describe initial attempts to generate microarray-based ERAs (4 studies, 154 conditions), which exhibit increased error, on our web site http://arep.med.harvard.edu/ExpressDB. We recommend standards for data reporting, suggest research into improving comparability of microarray data through quantifying and standardizing control condition RNA populations, and also suggest research into the calibration of different RNA assays. We introduce a model for a database that integrates different kinds of functional genomics data, Biomolecule Interaction, Growth and Expression Database (BIGED).