Multivariate analysis of variance test for gene set analysis.

Authors

Tsai Chen-An, Chen James J

Affiliation

Graduate Institute of Biostatistics and Biostatistics Center, China Medical University, Taichung, Taiwan.

Publication

Bioinformatics. 2009 Apr 1;25(7):897-903. doi: 10.1093/bioinformatics/btp098. Epub 2009 Mar 2.

DOI:10.1093/bioinformatics/btp098
PMID:19254923
Abstract

MOTIVATION

Gene class testing (GCT), or gene set analysis (GSA), is a statistical approach for determining whether functionally predefined sets of genes are expressed differently under different experimental conditions. Shortcomings of Fisher's exact test for overrepresentation analysis are illustrated by an example. Most alternative GSA methods were developed for data collected under two experimental conditions, and most are based on a univariate gene-by-gene test statistic or assume independence among the genes in the gene set. A multivariate analysis of variance (MANOVA) approach is proposed for studies with two or more experimental conditions.
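For readers unfamiliar with overrepresentation analysis, the following minimal sketch shows how a one-sided Fisher's exact test is usually computed for a single gene set. It is not taken from the paper, and the paper's own illustrative example is not reproduced here. The function name and argument names are hypothetical. Note that the test only sees counts of genes already declared significant, so its result depends on whatever cutoff produced that list.

```python
from math import comb


def ora_fisher_pvalue(n_total, n_set, n_sig, n_overlap):
    """One-sided Fisher's exact test (upper hypergeometric tail).

    n_total   : number of genes measured
    n_set     : number of genes in the predefined gene set
    n_sig     : number of genes declared differentially expressed
    n_overlap : number of genes in the set that were declared significant

    Returns P(overlap >= n_overlap) when the significant genes are drawn
    at random, without replacement, from all measured genes.
    """
    denom = comb(n_total, n_sig)
    upper = min(n_set, n_sig)
    tail = sum(
        comb(n_set, k) * comb(n_total - n_set, n_sig - k)
        for k in range(n_overlap, upper + 1)
    )
    return tail / denom
```

As a small check, `ora_fisher_pvalue(10, 4, 3, 3)` evaluates to 4/120: of the 120 ways to pick 3 genes out of 10, only 4 place all three inside a set of 4 genes.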

RESULTS

When the number of genes in the gene set is greater than the number of samples, the sample covariance matrix is singular and ill-conditioned, and standard multivariate methods can give biased results. The proposed MANOVA test replaces the sample covariance matrix with a shrinkage covariance matrix estimator. The MANOVA test and six other published GSA methods (principal component analysis, SAM-GS, analysis of covariance, Global, GSEA and MaxMean) are evaluated by simulation. The MANOVA test appears to perform best in terms of type I error control and power under the models considered in the simulation. Several publicly available microarray datasets with two and three experimental conditions are analyzed to illustrate GSA. Most methods, except GSEA and MaxMean, are generally comparable in their power to identify significant gene sets.
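The idea of a shrinkage-based multivariate test can be sketched as follows. This is an illustration under stated assumptions, not the authors' R code. The pooled within-group covariance is shrunk toward its own diagonal with a fixed, user-chosen intensity `lam`, whereas the abstract does not say how the paper chooses the intensity. A Lawley-Hotelling-type trace statistic stands in for whatever MANOVA statistic the paper uses. Significance is assessed by permuting sample labels, which is also an assumption. All function names and the toy data generator are hypothetical.

```python
import random


def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for c in range(n):
        piv = max(range(c, n), key=lambda r: abs(m[r][c]))
        m[c], m[piv] = m[piv], m[c]
        for r in range(c + 1, n):
            f = m[r][c] / m[c][c]
            for k in range(c, n + 1):
                m[r][k] -= f * m[c][k]
    x = [0.0] * n
    for r in reversed(range(n)):
        s = sum(m[r][k] * x[k] for k in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x


def column_means(rows):
    return [sum(col) / len(rows) for col in zip(*rows)]


def shrunk_pooled_cov(x, labels, lam):
    """Pooled within-group covariance S, shrunk as (1 - lam) * S + lam * diag(S)."""
    p = len(x[0])
    groups = sorted(set(labels))
    s = [[0.0] * p for _ in range(p)]
    for g in groups:
        rows = [r for r, lab in zip(x, labels) if lab == g]
        mu = column_means(rows)
        for r in rows:
            d = [v - m for v, m in zip(r, mu)]
            for i in range(p):
                for j in range(p):
                    s[i][j] += d[i] * d[j]
    dof = len(x) - len(groups)
    # Diagonal entries are kept; off-diagonal entries are scaled by (1 - lam).
    return [[(s[i][j] / dof) * (1.0 if i == j else 1.0 - lam)
             for j in range(p)] for i in range(p)]


def trace_statistic(x, labels, lam=0.5):
    """Sum over groups of n_k * (mean_k - grand)' S*^{-1} (mean_k - grand)."""
    s_star = shrunk_pooled_cov(x, labels, lam)
    grand = column_means(x)
    stat = 0.0
    for g in sorted(set(labels)):
        rows = [r for r, lab in zip(x, labels) if lab == g]
        d = [m - gm for m, gm in zip(column_means(rows), grand)]
        stat += len(rows) * sum(di * wi for di, wi in zip(d, solve(s_star, d)))
    return stat


def permutation_pvalue(x, labels, lam=0.5, n_perm=200, seed=0):
    """Permutation p-value obtained by shuffling the sample labels."""
    rng = random.Random(seed)
    observed = trace_statistic(x, labels, lam)
    perm = list(labels)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(perm)
        if trace_statistic(x, perm, lam) >= observed:
            hits += 1
    return (hits + 1) / (n_perm + 1)


def make_demo_data(n_per_group=4, n_genes=12, shift=10.0, seed=1):
    """Hypothetical toy data: three groups, the last shifted in every gene."""
    rng = random.Random(seed)
    x, labels = [], []
    for g in range(3):
        for _ in range(n_per_group):
            offset = shift if g == 2 else 0.0
            x.append([rng.gauss(0.0, 1.0) + offset for _ in range(n_genes)])
            labels.append(g)
    return x, labels
```

With the defaults, `make_demo_data()` yields 12 samples and 12 genes, so the unshrunk pooled covariance (9 degrees of freedom) is singular and could not be inverted. With `lam > 0` and nonzero gene variances the shrunk matrix is positive definite, so `permutation_pvalue(x, labels)` can be evaluated for three conditions at once.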

AVAILABILITY

Free R code to perform the MANOVA test is available at http://mail.cmu.edu.tw/~catsai/research.htm.

SUPPLEMENTARY INFORMATION

Supplementary data are available at Bioinformatics online.

