
Nonparametric methods for microarray data based on exchangeability and borrowed power.

Author information

Lee Mei-Ling Ting, Whitmore G A, Björkbacka Harry, Freeman Mason W

Affiliation

Department of Medicine, Brigham and Women's Hospital, Boston, Massachusetts, USA.

Publication information

J Biopharm Stat. 2005;15(5):783-97. doi: 10.1081/BIP-200067778.

Abstract

This article proposes nonparametric inference procedures for analyzing microarray gene expression data that are reliable, robust, and simple to implement. They are conceptually transparent and require no special-purpose software. The analysis begins by normalizing gene expression data in a unique way. The resulting adjusted observations consist of gene-treatment interaction terms (representing differential expression) and error terms. The error terms are considered to be exchangeable, which is the only substantial assumption. Thus, under a family null hypothesis of no differential expression, the adjusted observations are exchangeable and all permutations of the observations are equally probable. The investigator may use the adjusted observations directly in a distribution-free test method or use their ranks in a rank-based method, where the ranking is taken over the whole data set. For the latter, the essential steps are as follows: (1) Calculate a Wilcoxon rank-sum difference or a corresponding Kruskal-Wallis rank statistic for each gene. (2) Randomly permute the observations and repeat the previous step. (3) Independently repeat the random permutation a suitable number of times. Under the exchangeability assumption, the permutation statistics are independent random draws from a null cumulative distribution function (c.d.f.), which is approximated by the empirical c.d.f. Reference to the empirical c.d.f. indicates whether the test statistic for a gene is outlying and, hence, whether the gene shows differential expression. This feature is judged by using an appropriate rejection region or by computing a p-value for each test statistic, taking multiple testing into account. The distribution-free analog of the rank-based approach is also available and has parallel steps, which are described in the article. The proposed nonparametric analysis tends to give good results with no additional refinement, although a few refinements are presented that may interest some investigators. The implementation is illustrated with a case application involving the differential gene expression response of wild-type and knockout mice to an E. coli lipopolysaccharide (LPS) endotoxin treatment, relative to a baseline untreated condition.
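
The three rank-based steps in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the abstract does not give the normalization or the exact form of the statistic, so the sketch assumes the adjusted observations are already available as a genes-by-arrays matrix, takes the "rank-sum difference" to be the difference in mean global rank between two treatment groups, and uses a two-sided p-value with a +1 correction. The function names (`global_ranks`, `rank_difference`, `permutation_pvalues`) and the Bonferroni threshold in the demo are hypothetical choices made for this example.

```python
import numpy as np

def global_ranks(adjusted):
    """Rank all adjusted observations jointly over the whole data set (1 = smallest)."""
    flat = adjusted.ravel()
    ranks = np.empty(flat.size, dtype=float)
    ranks[np.argsort(flat, kind="stable")] = np.arange(1, flat.size + 1)
    return ranks.reshape(adjusted.shape)

def rank_difference(ranks, is_treated):
    """Per-gene difference in mean rank, treated minus control (assumed statistic)."""
    return ranks[:, is_treated].mean(axis=1) - ranks[:, ~is_treated].mean(axis=1)

def permutation_pvalues(adjusted, is_treated, n_perm=200, seed=0):
    """Steps (1)-(3): observed statistics, then a pooled permutation null."""
    rng = np.random.default_rng(seed)
    ranks = global_ranks(adjusted)
    observed = rank_difference(ranks, is_treated)           # step (1)
    null_stats = []
    for _ in range(n_perm):                                 # step (3)
        # Step (2): under the family null every observation is exchangeable,
        # so the ranks are shuffled across the whole matrix.
        shuffled = rng.permutation(ranks.ravel()).reshape(ranks.shape)
        null_stats.append(rank_difference(shuffled, is_treated))
    # Pool all permutation statistics into one empirical null distribution.
    null_abs = np.sort(np.abs(np.concatenate(null_stats)))
    # Count pooled null statistics at least as extreme as each observed one.
    n_ge = null_abs.size - np.searchsorted(null_abs, np.abs(observed), side="left")
    pvalues = (n_ge + 1) / (null_abs.size + 1)              # assumed +1 correction
    return observed, pvalues

if __name__ == "__main__":
    # Simulated data only: 200 genes, 4 treated and 4 control arrays.
    rng = np.random.default_rng(1)
    data = rng.normal(size=(200, 8))
    treated = np.array([True] * 4 + [False] * 4)
    data[0, treated] += 10       # one strongly differentially expressed gene
    data[0, ~treated] -= 10
    stats, pvals = permutation_pvalues(data, treated, n_perm=100, seed=2)
    flagged = np.flatnonzero(pvals < 0.05 / data.shape[0])  # Bonferroni (assumption)
    print("flagged genes:", flagged)
```

Pooling the permutation statistics across genes is what lets each gene's test draw on the whole data set, so even a modest number of permutations yields a finely resolved empirical null. For more than two treatment groups, a Kruskal-Wallis statistic computed on the same global ranks would take the place of `rank_difference`.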

