Gene set enrichment analysis using linear models and diagnostics.

Authors

Oron Assaf P, Jiang Zhen, Gentleman Robert

Affiliation

Fred Hutchinson Cancer Research Center, 1100 Fairview Avenue North, Seattle, WA 98109-1024, USA.

Publication information

Bioinformatics. 2008 Nov 15;24(22):2586-91. doi: 10.1093/bioinformatics/btn465. Epub 2008 Sep 11.

Abstract

MOTIVATION

Gene-set enrichment analysis (GSEA) can be greatly enhanced by linear model (regression) diagnostic techniques. Diagnostics can be used to identify outlying or influential samples, and also to evaluate model fit and explore model expansion.

RESULTS

We demonstrate this methodology on an adult acute lymphoblastic leukemia (ALL) dataset, using GSEA based on chromosome-band mapping of genes. Individual residuals, grouped or aggregated by chromosomal loci, indicate problematic samples and potential data-entry errors, and help identify hyperdiploidy as a factor playing a key role in expression for this dataset. Subsequent analysis pinpoints suspected DNA copy number abnormalities of specific samples and chromosomes (most prevalent are chromosomes X, 21 and 14), and also reveals significant expression differences between the hyperdiploid and diploid groups on other chromosomes (most prominently 19, 22, 3 and 13)--differences which are apparently not associated with copy number.

AVAILABILITY

Software for the statistical tools demonstrated in this article is available as Bioconductor package GSEAlm.

SUPPLEMENTARY INFORMATION

Supplementary data are available at Bioinformatics online.
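
The residual aggregation described under RESULTS can be illustrated with a minimal sketch. It is not the GSEAlm implementation: the function name `gene_set_residuals`, the synthetic data, and the choice of aggregate (sum of standardized residuals divided by the square root of the set size) are assumptions made for illustration. Each gene gets an ordinary least-squares fit against a shared design matrix, and the standardized residuals are then combined per gene set and per sample, so that a sample with an unusual value for a whole set (for example, a chromosome band) stands out.

```python
import numpy as np

def gene_set_residuals(expr, design, membership):
    """Aggregate per-gene linear-model residuals by gene set.

    expr:       genes x samples expression matrix
    design:     samples x p design matrix (include an intercept column)
    membership: sets x genes 0/1 indicator matrix
    Returns a sets x samples matrix of aggregated standardized residuals.
    """
    # One ordinary least-squares fit per gene; all genes share the design.
    coef, *_ = np.linalg.lstsq(design, expr.T, rcond=None)    # p x genes
    resid = expr.T - design @ coef                            # samples x genes
    # Scale each gene's residuals by its residual standard deviation.
    dof = design.shape[0] - np.linalg.matrix_rank(design)
    sigma = np.sqrt((resid ** 2).sum(axis=0) / dof)
    std_resid = resid / sigma
    # Assumed aggregate: sum over the set's genes, divided by sqrt(set size).
    set_sizes = membership.sum(axis=1, keepdims=True)
    return (membership @ std_resid.T) / np.sqrt(set_sizes)

# Synthetic example (hypothetical data): 40 genes, 12 samples, two groups.
rng = np.random.default_rng(0)
n_genes, n_samples = 40, 12
group = np.repeat([0.0, 1.0], n_samples // 2)
design = np.column_stack([np.ones(n_samples), group])
expr = rng.normal(size=(n_genes, n_samples)) + 0.5 * group

# Shift sample 3 upward for the first 20 genes to mimic a set-wide anomaly.
expr[:20, 3] += 5.0

# Two gene sets: genes 0-19 and genes 20-39.
membership = np.zeros((2, n_genes))
membership[0, :20] = 1.0
membership[1, 20:] = 1.0

scores = gene_set_residuals(expr, design, membership)
flagged = int(np.argmax(np.abs(scores[0])))
print("sample with the largest aggregated residual in set 0:", flagged)
```

Plotting each row of `scores` against the sample index gives a per-set residual display of the kind the abstract refers to. Leverage-adjusted residuals or a different aggregate could be substituted without changing the overall structure.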
