NEBION AG, Hohlstrasse 515, 8048 Zurich, Switzerland.
Philip Morris International R&D, Quai Jeanrenaud 5, 2003 Neuchatel, Switzerland.
BioData Min. 2014 Aug 31;7:18. doi: 10.1186/1756-0381-7-18. eCollection 2014.
Reference datasets are often used to compare, interpret or validate experimental data and analytical methods. In the field of gene expression, several reference datasets have been published. Typically, they consist of individual baseline or spike-in experiments carried out in a single laboratory and represent a particular set of conditions. Here, we describe a new type of standardized dataset that is representative of the spatial and temporal dimensions of gene expression. These datasets result from integrating expression data from a large number of globally normalized and quality-controlled public experiments. Expression data are aggregated by anatomical part or stage of development to yield a representative transcriptome for each category. For example, we created a genome-wide expression dataset representing the FDA tissue panel across 35 tissue types. The proposed datasets were created for human and several model organisms and are publicly available at http://www.expressiondata.org.
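The aggregation step described above can be illustrated with a minimal sketch. The abstract does not state which summary statistic is used, so the per-gene median here is an assumption, and the function name `aggregate_by_category`, the gene labels and the expression values are hypothetical placeholders rather than part of the published method or data.

```python
from collections import defaultdict
from statistics import median


def aggregate_by_category(samples):
    """Collapse per-sample expression profiles into one profile per category.

    `samples` is an iterable of (category, {gene: value}) pairs, where the
    category is an anatomical part or a developmental stage. The median is
    an assumed summary statistic; the source does not specify one.
    """
    grouped = defaultdict(lambda: defaultdict(list))
    for category, profile in samples:
        for gene, value in profile.items():
            grouped[category][gene].append(value)
    # One representative value per gene and category.
    return {
        category: {gene: median(values) for gene, values in genes.items()}
        for category, genes in grouped.items()
    }


# Hypothetical, already normalized expression values for illustration only.
samples = [
    ("liver", {"GENE_A": 10.0, "GENE_B": 2.0}),
    ("liver", {"GENE_A": 12.0, "GENE_B": 4.0}),
    ("liver", {"GENE_A": 11.0, "GENE_B": 3.0}),
    ("heart", {"GENE_A": 5.0, "GENE_B": 8.0}),
]
reference = aggregate_by_category(samples)
```

In this sketch each category ends up with a single value per gene, which is the sense in which the aggregated profile acts as a representative transcriptome; normalization and quality control are assumed to have been applied beforehand.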