Alter O, Brown P O, Botstein D
Departments of Genetics and Biochemistry, Stanford University, Stanford, CA 94305, USA.
Proc Natl Acad Sci U S A. 2000 Aug 29;97(18):10101-6. doi: 10.1073/pnas.97.18.10101.
We describe the use of singular value decomposition in transforming genome-wide expression data from genes × arrays space to reduced diagonalized "eigengenes" × "eigenarrays" space, where the eigengenes (or eigenarrays) are unique orthonormal superpositions of the genes (or arrays). Normalizing the data by filtering out the eigengenes (and eigenarrays) that are inferred to represent noise or experimental artifacts enables meaningful comparison of the expression of different genes across different arrays in different experiments. Sorting the data according to the eigengenes and eigenarrays gives a global picture of the dynamics of gene expression, in which individual genes and arrays appear to be classified into groups of similar regulation and function, or similar cellular state and biological phenotype, respectively. After normalization and sorting, the significant eigengenes and eigenarrays can be associated with observed genome-wide effects of regulators, or with measured samples, in which these regulators are overactive or underactive, respectively.
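The transformation, filtering, and sorting steps described above can be illustrated with a minimal NumPy sketch. This is not the authors' code: the function names, the synthetic data, the choice of which component to drop, and the phase-based ordering of genes are all assumptions made for illustration. It assumes the convention that rows of the expression matrix are genes and columns are arrays, so that the rows of `Vt` play the role of eigengenes and the columns of `U` the role of eigenarrays.

```python
import numpy as np


def svd_eigengenes(expr):
    """Decompose a genes x arrays matrix as expr = U @ diag(s) @ Vt.

    Columns of U are orthonormal "eigenarrays" (one entry per gene);
    rows of Vt are orthonormal "eigengenes" (one entry per array).
    """
    U, s, Vt = np.linalg.svd(expr, full_matrices=False)
    return U, s, Vt


def eigenexpression_fractions(s):
    """Fraction of the overall expression captured by each component."""
    return s**2 / np.sum(s**2)


def filter_components(U, s, Vt, drop):
    """Rebuild the data with the listed components zeroed out.

    Which components count as noise or artifact is a judgment call;
    the indices in `drop` are supplied by the caller (assumption).
    """
    s_kept = s.copy()
    s_kept[list(drop)] = 0.0
    return (U * s_kept) @ Vt


def sort_genes_by_phase(U, a=0, b=1):
    """Order genes by their angle in the plane of eigenarrays a and b.

    This is one possible sorting rule, chosen here for illustration.
    """
    return np.argsort(np.arctan2(U[:, b], U[:, a]))


# Synthetic example (hypothetical data): a steady offset shared by all
# genes, an oscillation with a gene-specific phase, and small noise.
rng = np.random.default_rng(0)
n_genes, n_arrays = 200, 12
t = np.linspace(0.0, 2.0 * np.pi, n_arrays, endpoint=False)
phase = rng.uniform(0.0, 2.0 * np.pi, size=n_genes)
expr = 3.0 + np.cos(t[None, :] - phase[:, None])
expr += 0.1 * rng.normal(size=expr.shape)

U, s, Vt = svd_eigengenes(expr)
print("eigenexpression fractions:", np.round(eigenexpression_fractions(s), 3))

# Assume the first component is the steady offset and filter it out.
normalized = filter_components(U, s, Vt, drop=[0])

# Re-decompose the filtered data and sort genes along the oscillation.
U2, s2, Vt2 = svd_eigengenes(normalized)
gene_order = sort_genes_by_phase(U2)
sorted_expr = normalized[gene_order, :]
```

In practice, deciding which eigengenes represent noise or artifacts requires inspecting the components (for example, their expression patterns across arrays) rather than dropping the first one by default, as this sketch does.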