Instituto de Física, Universidade Federal do Rio Grande do Sul, Porto Alegre, Brazil.
Nucleic Acids Res. 2011 Apr;39(8):3005-16. doi: 10.1093/nar/gkq1269. Epub 2010 Dec 15.
Extracting relevant information from genome-wide expression data is a challenge. The usual approaches compare cellular expression levels to a pre-established control and cluster genes based on the correlation of their expression levels. This implies that cluster definitions depend on the cellular metabolic state and may vary from one experiment to another. We present here a computational method that orders genes on a line and clusters them by the probability that their products interact. Protein-protein association information can be obtained from large databases such as STRING. The genome organization obtained this way is independent of specific experiments and defines functional modules that are associated with gene ontology terms. The starting point is a gene list and a matrix specifying interactions. Considering the Saccharomyces cerevisiae genome, we projected gene expression data onto the ordering, producing plots of transcription levels for two different experiments whose data are available at the Gene Expression Omnibus database. These plots discriminate metabolic cellular states, point to additional conclusions, and may be regarded as the first versions of 'transcriptograms'. This method is useful for extracting information from cell stimulus/response experiments, and may be applied for diagnostic purposes to different organisms.
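The two steps named in the abstract, ordering genes on a line from an interaction matrix and projecting expression data onto that ordering, can be illustrated with a toy sketch. The abstract does not give the cost function or the optimization scheme, so everything below is an assumption made for illustration: the cost (sum of line distances between interacting genes), the greedy random-swap search, the running-window average, and all function names are hypothetical and should not be read as the authors' algorithm.

```python
import random


def ordering_cost(order, adjacency):
    """Assumed cost: sum of line distances between all interacting gene pairs."""
    position = {gene: idx for idx, gene in enumerate(order)}
    n = len(order)
    cost = 0
    for i in range(n):
        for j in range(i + 1, n):
            if adjacency[i][j]:
                cost += abs(position[i] - position[j])
    return cost


def order_genes(adjacency, sweeps=200, seed=0):
    """Greedy random-swap search (an assumed stand-in for the paper's optimizer)."""
    rng = random.Random(seed)
    n = len(adjacency)
    order = list(range(n))
    rng.shuffle(order)
    best = ordering_cost(order, adjacency)
    for _ in range(sweeps * n):
        a, b = rng.sample(range(n), 2)
        order[a], order[b] = order[b], order[a]
        trial = ordering_cost(order, adjacency)
        if trial <= best:
            best = trial  # keep the swap
        else:
            order[a], order[b] = order[b], order[a]  # undo the swap
    return order, best


def project_expression(order, expression, window=3):
    """Running-window mean of per-gene expression values along the ordering."""
    half = window // 2
    profile = []
    for centre in range(len(order)):
        lo = max(0, centre - half)
        hi = min(len(order), centre + half + 1)
        values = [expression[order[k]] for k in range(lo, hi)]
        profile.append(sum(values) / len(values))
    return profile


# Toy data (invented): genes 0, 2, 4 form one fully connected module,
# genes 1, 3, 5 form another; only the first module is expressed.
MODULE_A, MODULE_B = (0, 2, 4), (1, 3, 5)
adjacency = [[0] * 6 for _ in range(6)]
for module in (MODULE_A, MODULE_B):
    for g in module:
        for h in module:
            if g != h:
                adjacency[g][h] = 1
expression = [1, 0, 1, 0, 1, 0]

order, cost = order_genes(adjacency)
print("ordering:", order, "cost:", cost)
print("profile:", project_expression(order, expression, window=3))
```

In this toy case the interacting genes end up adjacent on the line, so the projected profile separates the expressed module from the silent one. A real application would replace the toy matrix with association scores from a database such as STRING and would need a far more efficient cost update than recomputing the full sum after every swap.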