Selection and validation of normalization methods for c-DNA microarrays using within-array replications.

Authors

Fan Jianqing, Niu Yue

Affiliation

Department of Operations Research and Financial Engineering, Princeton University, Princeton, NJ 08544, USA.

Publication

Bioinformatics. 2007 Sep 15;23(18):2391-8. doi: 10.1093/bioinformatics/btm361. Epub 2007 Jul 27.

Abstract

MOTIVATION

Normalization of microarray data is essential for multiple-array analyses. Several normalization protocols have been proposed based on different biological or statistical assumptions. A fundamental question is whether these protocols have effectively normalized the arrays. In addition, for a given array, there is the question of how to choose the method that normalizes its data most effectively.

RESULTS

We propose several techniques to compare the effectiveness of different normalization methods. We approach the problem by constructing statistics to test whether there are any systematic biases in the expression profiles among duplicated spots within an array. The test statistics involve estimating the genewise variances. This is accomplished by using several novel methods, including empirical Bayes methods for moderating the genewise variances and smoothing methods for aggregating variance information. P-values are estimated based on a normal or chi-square approximation. With the estimated P-values, we can choose the most appropriate method to normalize a specific array and assess the extent to which the systematic biases due to variations in experimental conditions have been removed. The effectiveness and validity of the proposed methods are illustrated by a carefully designed simulation study. The method is further illustrated by applications to three data sets: human placenta cDNAs comprising a large number of clones with replications; a customized microarray experiment carrying just a few hundred genes, for studying the molecular roles of interferons in tumors; and the Agilent microarrays carrying tens of thousands of total RNA samples in the MAQC project, for studying the reproducibility, sensitivity and specificity of the data.
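The general idea of testing within-array replicates for leftover systematic bias can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' statistics or their R code: the function names `moderate_variances` and `replicate_bias_test`, the fixed `prior_df` constant, the simple shrinkage toward the mean variance, and the simulated data are all hypothetical choices made here. The sketch takes normalized log-ratios with one row per gene and one column per replicate spot, moderates the genewise variances, and uses a normal approximation to ask whether one replicate position is shifted relative to the others.

```python
import math
import numpy as np


def moderate_variances(s2, df, prior_df=4.0):
    """Shrink genewise variances toward their mean.

    prior_df is an assumed tuning constant, not an estimated hyperparameter.
    """
    s2 = np.asarray(s2, dtype=float)
    prior_var = s2.mean()
    return (prior_df * prior_var + df * s2) / (prior_df + df)


def replicate_bias_test(y, column=0, prior_df=4.0):
    """Normal-approximation test for a shift in one replicate column.

    y: array of shape (genes, replicates) of normalized log-ratios.
    Returns (z, two-sided p-value).
    """
    y = np.asarray(y, dtype=float)
    n_genes, n_reps = y.shape
    # Remove the gene effect so only replicate-to-replicate variation remains.
    resid = y - y.mean(axis=1, keepdims=True)
    df = n_reps - 1
    s2 = (resid ** 2).sum(axis=1) / df           # raw genewise variances
    s2_mod = moderate_variances(s2, df, prior_df)
    # Standardize each gene's residual in the chosen column.
    t = resid[:, column] / np.sqrt(s2_mod * (1.0 - 1.0 / n_reps))
    # One-sample z statistic across genes (relies on a large number of genes).
    z = math.sqrt(n_genes) * t.mean() / t.std(ddof=1)
    p = math.erfc(abs(z) / math.sqrt(2.0))       # two-sided normal tail
    return z, p


# Simulated data: 2000 genes, 4 replicate spots each, gene-specific noise.
rng = np.random.default_rng(0)
sigma = rng.uniform(0.2, 0.4, size=(2000, 1))
clean = rng.normal(0.0, sigma, size=(2000, 4))
biased = clean.copy()
biased[:, 0] += 0.5                              # systematic shift in one spot position

z_clean, p_clean = replicate_bias_test(clean)
z_biased, p_biased = replicate_bias_test(biased)
print("clean:  z = %.2f, p = %.3g" % (z_clean, p_clean))
print("biased: z = %.2f, p = %.3g" % (z_biased, p_biased))
```

In this toy setting, a small P-value for an array after normalization would suggest that systematic differences between replicate positions remain, so comparing P-values across candidate normalization methods gives a rough way to rank them for that array.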

AVAILABILITY

Code to implement the method in the statistical package R is available from the authors.
