Fold-change estimation of differentially expressed genes using mixture mixed-model.

Authors

Gusnanto Arief, Ploner Alexander, Pawitan Yudi

Affiliation

Medical Research Council-Biostatistics Unit, Institute of Public Health, Cambridge CB2 2SR, United Kingdom.

Publication information

Stat Appl Genet Mol Biol. 2005;4:Article26. doi: 10.2202/1544-6115.1145. Epub 2005 Sep 21.

Abstract

Microarray experiments produce expression measurements for thousands of genes simultaneously, though usually for a small number of RNA samples. The most common problem is the identification of genes that are differentially expressed between different groups of samples or biological conditions. As the number of genes far exceeds the number of RNA samples, the inherent multiplicity poses a severe problem in both hypothesis testing and effect estimation. While much of the recent literature is focused on the hypothesis-testing aspects, we concentrate in this paper on effect estimation as a tool for the identification of differentially expressed genes. We propose a linear mixed model where the random effects are assumed to follow a mixture distribution, and study in detail the case of three normals, corresponding to genes that are down-, up- or non-regulated. Our approach leads to a new type of non-linear shrinkage estimation, where a proportion of estimates is shrunk to zero, while the rest follows standard linear shrinkage. This allows us to estimate the log fold-change of the genes involved and to identify those that are differentially expressed within the same model framework. We investigate the operating characteristics of our method using simulation and spike-in studies, and illustrate its application to real data using a breast-cancer dataset.
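
The non-linear shrinkage described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes a single observed log fold-change per gene with known noise variance, and a three-component normal prior whose weights, means, and variances are known, whereas the paper estimates such quantities within a linear mixed model. The function names and all numeric values are hypothetical. Under these assumptions the posterior mean is a probability-weighted average of per-component linear shrinkage estimates, so small observed effects are pulled almost to zero while large ones are shrunk roughly linearly.

```python
import numpy as np


def normal_pdf(x, mean, var):
    """Density of N(mean, var) evaluated at x (broadcasts over arrays)."""
    return np.exp(-0.5 * (x - mean) ** 2 / var) / np.sqrt(2.0 * np.pi * var)


def mixture_shrinkage(d, noise_var, weights, means, variances):
    """Posterior mean of the true effect b given an observed effect d.

    Assumed (hypothetical) model: d | b ~ N(b, noise_var) and
    b ~ sum_k weights[k] * N(means[k], variances[k]).
    Returns the shrunken estimates and the posterior component probabilities.
    """
    d = np.asarray(d, dtype=float)[:, None]           # shape (n_genes, 1)
    weights = np.asarray(weights, dtype=float)        # shape (K,)
    means = np.asarray(means, dtype=float)
    variances = np.asarray(variances, dtype=float)

    # Marginally, d ~ N(means[k], variances[k] + noise_var) under component k.
    marginal_var = variances + noise_var
    lik = weights * normal_pdf(d, means, marginal_var)     # (n_genes, K)
    post_prob = lik / lik.sum(axis=1, keepdims=True)

    # Within a component, the posterior mean is ordinary linear shrinkage
    # of d toward the component mean.
    shrink = variances / marginal_var
    comp_mean = means + shrink * (d - means)

    estimate = (post_prob * comp_mean).sum(axis=1)
    return estimate, post_prob


# Hypothetical parameters: components are down-, non-, and up-regulated genes.
# A zero prior variance makes the non-regulated component a point mass at 0.
weights = [0.05, 0.90, 0.05]
means = [-2.0, 0.0, 2.0]
variances = [1.0, 0.0, 1.0]

observed = np.array([0.0, 0.3, 1.0, 3.0])   # observed log fold-changes
est, prob = mixture_shrinkage(observed, 0.25, weights, means, variances)
for d_obs, b_hat, p in zip(observed, est, prob):
    print(f"observed {d_obs:4.1f}  shrunk {b_hat:7.4f}  P(non-regulated) {p[1]:.3f}")
```

In this sketch the posterior probability of the non-regulated component also serves as a simple per-gene indicator of differential expression, which mirrors the abstract's point that estimation and identification happen within one model.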
