Antonov Alexey V, Tetko Igor V, Kosykh Denis, Surmeli Dmitrij, Mewes Hans-Werner
GSF National Research Center for Environment and Health, Institute for Bioinformatics, Ingolstädter Landstrasse 1, D-85764 Neuherberg, Germany.
Comput Biol Chem. 2005 Aug;29(4):288-93. doi: 10.1016/j.compbiolchem.2005.06.004.
Most analyses of expression data exploit information on the variability of gene intensity across samples. This information is sensitive to the initial data processing, which in turn affects the final conclusions. However, expression data also contain scale-free information that is directly comparable between samples. We propose using pairwise ratios of gene expression values, rather than absolute intensities, to classify expression data. This information is robust to data processing and is therefore more attractive for classification analyses. In the proposed scheme, only information on relative gene expression levels within each sample is exploited. In tests on publicly available datasets, the approach gave superior classification results.
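The idea of scale-free, within-sample features can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify how the ratios are encoded or which classifier is used, so the function names, the log-ratio and binary-order encodings, and the intensity values below are all assumptions made for illustration. The sketch only shows that pairwise ratios computed within one sample are unchanged when that sample is multiplied by a constant.

```python
import math
from itertools import combinations


def pairwise_log_ratios(sample):
    """Return log(x_i / x_j) for every gene pair i < j within one sample."""
    return [math.log(sample[i] / sample[j])
            for i, j in combinations(range(len(sample)), 2)]


def pairwise_order_features(sample):
    """Return 1 if gene i is expressed above gene j, else 0, for each pair i < j."""
    return [1 if sample[i] > sample[j] else 0
            for i, j in combinations(range(len(sample)), 2)]


# Hypothetical intensities for four genes in one sample (illustrative only).
raw = [120.0, 30.0, 60.0, 15.0]

# The same sample after a hypothetical global rescaling, standing in for
# a processing step that multiplies every intensity by a constant.
rescaled = [2.5 * x for x in raw]

ratios_raw = pairwise_log_ratios(raw)
ratios_rescaled = pairwise_log_ratios(rescaled)

# The pairwise features agree even though the absolute intensities differ.
scale_invariant = all(
    math.isclose(a, b, abs_tol=1e-12)
    for a, b in zip(ratios_raw, ratios_rescaled)
)
print(scale_invariant)
print(pairwise_order_features(raw) == pairwise_order_features(rescaled))
```

Note that the log-ratio features are invariant only to multiplicative rescaling of a sample; an additive background correction or a nonlinear transformation would change them. The binary order features are more permissive and survive any strictly increasing transformation applied to the whole sample.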