Robustified MANOVA with applications in detecting differentially expressed genes from oligonucleotide arrays.

Authors

Xu Jin, Cui Xinping

Affiliation

Department of Statistics, East China Normal University, Shanghai 200241, China.

Publication

Bioinformatics. 2008 Apr 15;24(8):1056-62. doi: 10.1093/bioinformatics/btn053. Epub 2008 Mar 3.

Abstract

MOTIVATION

Oligonucleotide arrays such as Affymetrix GeneChips use multiple probes, or a probe set, to measure the abundance of mRNA of every gene of interest. Some analysis methods attempt to summarize the multiple observations into one single score before conducting further analysis such as detecting differentially expressed genes (DEG), clustering and classification. However, there is a risk of losing a significant amount of information and consequently reaching inaccurate or even incorrect conclusions during this data reduction.

RESULTS

We developed a novel statistical method, robustified multivariate analysis of variance (MANOVA), based on the traditional MANOVA model and a permutation test, to detect DEG in both one-way and two-way designs. It can be extended to detect certain special patterns of gene expression through profile analysis across k (≥ 2) populations. The method uses probe-level data and requires no assumptions about the distribution of the dataset. We also propose a method of estimating the null distribution using quantile normalization, in contrast to the 'pooling' method (Section 3.1). Monte Carlo simulation and real data analysis are conducted to compare the performance of the proposed method with that of the 'pooling' method and the usual analysis of variance (ANOVA) test based on the summarized scores. The new method successfully detects DEG at the desired false discovery rate and is more powerful than the competing method, especially when the number of groups is small.
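The general idea of a permutation test on probe-level data for a single gene can be illustrated with a minimal sketch. This is not the authors' robustified MANOVA: the abstract does not specify their test statistic, so the sketch assumes a simple trace-based pseudo-F (between-group over within-group sum of squares, summed across probes), which remains defined when a probe set has more probes than there are arrays. The function names `pseudo_f` and `permutation_pvalue` and the simulated data are hypothetical.

```python
import random


def pseudo_f(rows, labels):
    """Trace-based pseudo-F: (tr(B) / (k - 1)) / (tr(W) / (n - k)).

    rows:   one list of probe intensities per array, all for the same gene.
    labels: group label of each array (one-way layout).
    Assumed statistic for illustration only, not the one used in the paper.
    """
    n_probes = len(rows[0])
    groups = sorted(set(labels))
    n, k = len(rows), len(groups)
    grand = [sum(r[j] for r in rows) / n for j in range(n_probes)]
    between = 0.0
    within = 0.0
    for g in groups:
        members = [r for r, lab in zip(rows, labels) if lab == g]
        mean_g = [sum(r[j] for r in members) / len(members)
                  for j in range(n_probes)]
        # Between-group sum of squares, accumulated over all probes.
        between += len(members) * sum(
            (m - c) ** 2 for m, c in zip(mean_g, grand))
        # Within-group sum of squares, accumulated over all probes.
        within += sum((r[j] - mean_g[j]) ** 2
                      for r in members for j in range(n_probes))
    return (between / (k - 1)) / (within / (n - k))


def permutation_pvalue(rows, labels, n_perm=999, seed=0):
    """Estimate a p-value by shuffling group labels across arrays."""
    rng = random.Random(seed)
    observed = pseudo_f(rows, labels)
    shuffled = list(labels)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if pseudo_f(rows, shuffled) >= observed:
            hits += 1
    # Add one to numerator and denominator so the estimate is never zero.
    return (hits + 1) / (n_perm + 1)


if __name__ == "__main__":
    # Simulated gene: 11 probes, 6 arrays per group, group B shifted upward.
    rng = random.Random(1)
    labels = ["A"] * 6 + ["B"] * 6
    rows = [[rng.gauss(0, 1) + (2.0 if lab == "B" else 0.0)
             for _ in range(11)] for lab in labels]
    print(permutation_pvalue(rows, labels))
```

Because the reference distribution is built by relabelling the arrays, the p-value does not rely on normality of the probe intensities, which is the sense in which such a test is distribution-free. In a genome-wide screen the per-gene p-values would still need a multiple-testing step to control the false discovery rate.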

AVAILABILITY

The package of robustified MANOVA can be downloaded from http://faculty.ucr.edu/~xpcui/software
