Group testing for pathway analysis improves comparability of different microarray datasets.

Authors

Manoli Theodora, Gretz Norbert, Gröne Hermann-Josef, Kenzelmann Marc, Eils Roland, Brors Benedikt

Affiliation

Theoretical Bioinformatics, German Cancer Research Center, 69120 Heidelberg, Germany.

Publication information

Bioinformatics. 2006 Oct 15;22(20):2500-6. doi: 10.1093/bioinformatics/btl424. Epub 2006 Aug 7.

Abstract

MOTIVATION

The wide use of DNA microarrays for investigating the cell transcriptome has triggered the invention of numerous methods for processing microarray data and has led to a growing number of microarray studies that examine the same biological conditions. However, gene lists obtained by different statistical methods or from different datasets rarely agree when compared directly. We aimed to examine such discrepancies at the level of biologically related groups of genes that appear to be affected, e.g. metabolic or signalling pathways. This can be achieved by group testing procedures, e.g. over-representation analysis, functional class scoring (FCS), or global tests.

RESULTS

Three public prostate cancer datasets obtained with the same microarray platform (HGU95A/HGU95Av2) were analyzed. Each dataset was normalized by either variance stabilizing normalization (vsn) or mixed model normalization (MMN). Statistical analysis of microarrays was then applied to the vsn-normalized data, and mixed model analysis to the data normalized by MMN. For multiple testing adjustment, the false discovery rate was calculated and the threshold was set to 0.05. Gene lists from the same method applied to different datasets showed overlaps between 42 and 52%, while lists from different methods applied to the same dataset had between 63 and 85% of genes in common. The six gene lists obtained by applying the two statistical methods to the three datasets were then subjected to group testing by Fisher's exact test. Group testing by GSEA and the global test was also applied to the three datasets. Fisher's exact test, followed by the global test, gave more consistent results than GSEA with respect to the concordance between analyses of gene lists obtained by different methods and from different datasets. However, all group testing methods identified pathways that had already been described as involved in the pathogenesis of prostate cancer. Moreover, pathways recurrently identified in these analyses are more likely to be reliable than those from a single analysis on a single dataset.
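As an illustration of over-representation analysis by Fisher's exact test, the following is a minimal sketch and not the authors' implementation. It assumes a one-sided test, computed as the upper tail of the hypergeometric distribution; the function name `ora_fisher_pvalue`, its parameters, and the toy counts are hypothetical and do not come from the study.

```python
from math import comb


def ora_fisher_pvalue(n_universe, n_pathway, n_selected, n_overlap):
    """One-sided Fisher's exact test for pathway over-representation.

    Returns P(X >= n_overlap), where X is the number of pathway genes in a
    random gene list of size n_selected drawn without replacement from a
    universe of n_universe genes, n_pathway of which belong to the pathway.
    All argument names are illustrative assumptions.
    """
    if n_pathway > n_universe or n_selected > n_universe:
        raise ValueError("pathway and gene list must fit in the universe")
    upper = min(n_pathway, n_selected)
    if not 0 <= n_overlap <= upper:
        raise ValueError("overlap is inconsistent with the other counts")

    # Number of ways to draw any gene list of this size from the universe.
    total = comb(n_universe, n_selected)
    # Sum the hypergeometric probabilities for overlaps at least as large
    # as the one observed.
    tail = sum(
        comb(n_pathway, k) * comb(n_universe - n_pathway, n_selected - k)
        for k in range(n_overlap, upper + 1)
    )
    return tail / total


# Toy example with made-up counts: 10 genes on the array, 4 in the pathway,
# 3 genes called differentially expressed, 2 of them in the pathway.
p_value = ora_fisher_pvalue(10, 4, 3, 2)
print(p_value)
```

In practice, one p-value would be computed per pathway for each gene list, and the resulting p-values would themselves need multiple testing adjustment; that step is omitted from this sketch.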

