Use of within-array replicate spots for assessing differential expression in microarray experiments.
Author information
Smyth Gordon K, Michaud Joëlle, Scott Hamish S
Affiliation
Walter and Eliza Hall Institute of Medical Research, Melbourne, Vic, Australia.
Publication information
Bioinformatics. 2005 May 1;21(9):2067-75. doi: 10.1093/bioinformatics/bti270. Epub 2005 Jan 18.
MOTIVATION
Spotted arrays are often printed with probes in duplicate or triplicate, but current methods for assessing differential expression are not able to make full use of the resulting information. The usual practice is to average the duplicate or triplicate results for each probe before assessing differential expression. This results in the loss of valuable information about genewise variability.
RESULTS
A method is proposed for extracting more information from within-array replicate spots in microarray experiments by estimating the strength of the correlation between them. The method involves fitting separate linear models to the expression data for each gene but with a common value for the between-replicate correlation. The method greatly improves the precision with which the genewise variances are estimated and thereby improves inference methods designed to identify differentially expressed genes. The method may be combined with empirical Bayes methods for moderating the genewise variances between genes. The method is validated using data from a microarray experiment involving calibration and ratio control spots in conjunction with spiked-in RNA. Comparing results for calibration and ratio control spots shows that the common correlation method results in substantially better discrimination of differentially expressed genes from those which are not. The spike-in experiment also confirms that the results may be further improved by empirical Bayes smoothing of the variances when the sample size is small.
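The common correlation idea can be illustrated with a minimal sketch. It is not the limma implementation: it assumes the simplest possible design (each gene spotted in duplicate on every array, with one mean log-ratio to estimate), uses ANOVA moment estimators rather than a general linear model fit, and pools the genewise correlations by averaging on the Fisher z (atanh) scale. All function names, and the `prior_var` and `prior_df` parameters of the empirical Bayes step, are hypothetical and chosen for illustration only.

```python
import math
import random


def anova_components(pairs):
    """Return (ssb, ssw, n) for one gene; pairs holds one (spot1, spot2) per array."""
    n = len(pairs)
    array_means = [(a + b) / 2.0 for a, b in pairs]
    grand_mean = sum(array_means) / n
    # Between-array sum of squares (each array mean summarises 2 spots).
    ssb = 2.0 * sum((m - grand_mean) ** 2 for m in array_means)
    # Within-array sum of squares; for duplicates this equals (a - b)^2 / 2.
    ssw = sum((a - b) ** 2 / 2.0 for a, b in pairs)
    return ssb, ssw, n


def gene_correlation(pairs):
    """Moment estimate of the correlation between duplicate spots for one gene."""
    ssb, ssw, n = anova_components(pairs)
    msb, msw = ssb / (n - 1), ssw / n
    total = msb + msw
    return (msb - msw) / total if total > 0 else 0.0


def common_correlation(genes, limit=0.99):
    """Pool genewise correlations on the atanh scale (assumption of this sketch)."""
    z = [math.atanh(max(-limit, min(limit, gene_correlation(p)))) for p in genes]
    return math.tanh(sum(z) / len(z))


def gene_fit(pairs, rho):
    """Fit one gene with the correlation fixed at the common value rho.

    Returns (mean, sigma2, df, t). Keeping both spots gives 2n - 1 residual
    degrees of freedom, versus n - 1 if the duplicates were averaged first.
    """
    ssb, ssw, n = anova_components(pairs)
    df = 2 * n - 1
    sigma2 = (ssb / (1.0 + rho) + ssw / (1.0 - rho)) / df
    mean = sum(a + b for a, b in pairs) / (2.0 * n)
    se = math.sqrt(sigma2 * (1.0 + rho) / (2.0 * n))
    t = mean / se if se > 0 else 0.0
    return mean, sigma2, df, t


def moderated_variance(sigma2, df, prior_var, prior_df):
    """Shrink a genewise variance toward a prior value (hypothetical prior inputs)."""
    return (prior_df * prior_var + df * sigma2) / (prior_df + df)


def simulate(n_genes, n_arrays, rho, seed=0):
    """Simulate null log-ratios with unit variance and duplicate correlation rho."""
    rng = random.Random(seed)
    sd_shared, sd_spot = math.sqrt(rho), math.sqrt(1.0 - rho)
    genes = []
    for _ in range(n_genes):
        pairs = []
        for _ in range(n_arrays):
            shared = rng.gauss(0.0, sd_shared)  # array-level effect shared by both spots
            pairs.append((shared + rng.gauss(0.0, sd_spot),
                          shared + rng.gauss(0.0, sd_spot)))
        genes.append(pairs)
    return genes
```

A typical use would call `common_correlation` once over all genes and then `gene_fit` per gene with the pooled value. The benefit the abstract describes shows up in the degrees of freedom: with four arrays, averaging duplicates leaves 3 residual degrees of freedom per gene, whereas fixing the correlation at a common value leaves 7.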
AVAILABILITY
The methodology is implemented in the limma software package for R, available from the CRAN repository (http://www.r-project.org).