Comparative analysis of microarray normalization procedures: effects on reverse engineering gene networks.

Authors

Lim Wei Keat, Wang Kai, Lefebvre Celine, Califano Andrea

Affiliation

Department of Biomedical Informatics, Columbia University, 622 West 168th Street, Vanderbilt Clinic 5th Floor, New York, NY 10032, USA.

Publication

Bioinformatics. 2007 Jul 1;23(13):i282-8. doi: 10.1093/bioinformatics/btm201.

Abstract

MOTIVATION

An increasingly common application of gene expression profile data is the reverse engineering of cellular networks. However, common procedures to normalize expression profiles generated using the Affymetrix GeneChips technology were originally developed for a rather different purpose, namely the accurate measure of differential gene expression between two or more phenotypes. As a result, current evaluation strategies lack comprehensive metrics to assess the suitability of available normalization procedures for reverse engineering and, in general, for measuring correlation between the expression profiles of a gene pair.

RESULTS

We benchmark four commonly used normalization procedures (MAS5, RMA, GCRMA and Li-Wong) in the context of established algorithms for the reverse engineering of protein-protein and protein-DNA interactions. Replicate-sample, randomized and human B-cell data sets are used as input. Surprisingly, our study suggests that MAS5 provides the most faithful cellular network reconstruction. Furthermore, we identify a crucial step in GCRMA that introduces severe artifacts in the data, leading to a systematic overestimate of pairwise correlation. This has key implications not only for reverse engineering but also for other methods, such as hierarchical clustering, that rely on accurate measurements of pairwise expression profile correlation. We propose an alternative implementation to eliminate this side effect.
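To make the idea of a systematic overestimate of pairwise correlation concrete, the following is a minimal sketch of one way such inflation could be checked on randomized data. It is not the authors' benchmark: the function names `pairwise_correlations` and `shuffle_within_genes`, the matrix dimensions, and the use of synthetic Gaussian values in place of normalized expression data are all assumptions made for illustration. The reasoning is that once each gene's values are permuted independently across samples, no real co-expression remains, so the distribution of Pearson correlations should be centered near zero; a clear shift away from zero on data processed this way would hint at an artifact introduced downstream of the shuffle.

```python
import numpy as np


def pairwise_correlations(expr):
    """Return the Pearson correlation of every gene pair.

    expr is a genes x samples matrix; the result is a flat array
    holding the strict upper triangle of the correlation matrix.
    """
    corr = np.corrcoef(expr)  # rows are treated as variables
    upper = np.triu_indices_from(corr, k=1)
    return corr[upper]


def shuffle_within_genes(expr, rng):
    """Permute each gene's values across samples independently.

    This keeps every gene's marginal distribution but destroys
    any real correlation between genes.
    """
    return np.array([rng.permutation(row) for row in expr])


# Synthetic stand-in for a normalized expression matrix (assumption):
# 50 hypothetical genes measured in 40 hypothetical samples.
rng = np.random.default_rng(0)
expr = rng.normal(size=(50, 40))

null = shuffle_within_genes(expr, rng)
r = pairwise_correlations(null)

print("gene pairs:", r.size)
print("mean correlation on shuffled data:", float(np.mean(r)))
print("mean absolute correlation:", float(np.mean(np.abs(r))))
```

In practice the same summary would be computed separately on the output of each normalization procedure, and the resulting distributions compared against this null expectation.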

