Group testing for pathway analysis improves comparability of different microarray datasets.

Authors

Manoli Theodora, Gretz Norbert, Gröne Hermann-Josef, Kenzelmann Marc, Eils Roland, Brors Benedikt

Affiliation

Theoretical Bioinformatics, German Cancer Research Center, 69120 Heidelberg, Germany.

Publication information

Bioinformatics. 2006 Oct 15;22(20):2500-6. doi: 10.1093/bioinformatics/btl424. Epub 2006 Aug 7.

DOI:10.1093/bioinformatics/btl424

PMID:16895928

Abstract

MOTIVATION

The wide use of DNA microarrays for investigating the cell transcriptome has triggered the invention of numerous methods for processing microarray data and has led to a growing number of microarray studies that examine the same biological conditions. However, comparisons made at the level of gene lists obtained by different statistical methods or from different datasets hardly converge. We aimed to examine such discrepancies at the level of apparently affected groups of biologically related genes, e.g. metabolic or signalling pathways. This can be achieved by group testing procedures, e.g. over-representation analysis, functional class scoring (FCS), or global tests.

RESULTS

Three public prostate cancer datasets obtained with the same microarray platform (HGU95A/HGU95Av2) were analyzed. Each dataset was normalized by either variance stabilizing normalization (vsn) or mixed model normalization (MMN). Then, statistical analysis of microarrays was applied to the vsn-normalized data and mixed model analysis to the data normalized by MMN. For multiple testing adjustment, the false discovery rate was calculated and the threshold was set to 0.05. Gene lists from the same method applied to different datasets showed overlaps between 42 and 52%, while lists from different methods applied to the same dataset had between 63 and 85% of genes in common. The six gene lists obtained by applying the two statistical methods to the three datasets were then subjected to group testing by Fisher's exact test. Group testing by GSEA and by the global test was also applied to the three datasets. Fisher's exact test, followed by the global test, gave more consistent results than GSEA with respect to the concordance between analyses of gene lists obtained by different methods and from different datasets. However, all group testing methods identified pathways that had already been described as being involved in the pathogenesis of prostate cancer. Moreover, pathways recurrently identified in these analyses are more likely to be reliable than those from a single analysis on a single dataset.
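As an illustration of group testing by Fisher's exact test, the following minimal Python sketch runs an over-representation analysis on a gene list. It is not the authors' implementation: the function names, the toy gene identifiers, and the pathway definitions are hypothetical, and the use of the Benjamini-Hochberg procedure for the false discovery rate at the pathway level is an assumption (the abstract states only that an FDR threshold of 0.05 was used for the gene lists).

```python
from math import comb


def fisher_right_tail(k, K, n, N):
    """One-sided Fisher's exact test p-value, P(X >= k).

    X follows a hypergeometric distribution with
    N genes in the universe, K genes in the pathway,
    n genes in the significant list, k of them in the pathway.
    """
    total = comb(N, n)
    upper = min(K, n)
    hits = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return hits / total


def benjamini_hochberg(pvalues):
    """Benjamini-Hochberg adjusted p-values (assumed FDR procedure)."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, pvalues[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


def over_representation(significant, universe, pathways, alpha=0.05):
    """Return {pathway: (p-value, adjusted p-value, flagged)}."""
    universe = set(universe)
    significant = set(significant) & universe
    names, pvals = [], []
    for name, members in pathways.items():
        members = set(members) & universe
        k = len(members & significant)
        pvals.append(
            fisher_right_tail(k, len(members), len(significant), len(universe))
        )
        names.append(name)
    adjusted = benjamini_hochberg(pvals)
    return {n: (p, q, q <= alpha) for n, p, q in zip(names, pvals, adjusted)}


# Hypothetical toy data: 20 genes, 5 of them called differentially expressed.
universe = [f"g{i}" for i in range(20)]
significant = ["g0", "g1", "g2", "g3", "g4"]
pathways = {
    "pathway_A": ["g0", "g1", "g2", "g3"],      # all members are significant
    "pathway_B": ["g10", "g11", "g12", "g13"],  # no member is significant
}
result = over_representation(significant, universe, pathways)
for name, (p, q, flagged) in result.items():
    print(f"{name}: p={p:.4g}, adjusted={q:.4g}, flagged={flagged}")
```

Running the same function on each of several gene lists and intersecting the flagged pathways is one simple way to look for pathways that recur across datasets and methods, in the spirit of the comparison described above.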
