Department of Biostatistics and Bioinformatics, Duke University Medical Center, Durham, North Carolina 27710, USA.
BMC Bioinformatics. 2009 Oct 15;10:336. doi: 10.1186/1471-2105-10-336.
Time-course microarray experiments are widely used to study the temporal profiles of gene expression. Storey et al. (2005) developed a method for analyzing time-course microarray studies that can be applied to discover genes whose expression trajectories change over time within a single biological group, or genes that follow different time trajectories among multiple groups. They estimated the expression trajectory of each gene using natural cubic splines under the null (no time-course) and alternative (time-course) hypotheses, and used a goodness-of-fit test statistic to quantify the discrepancy between the two fits. The null distribution of the statistic was approximated by a bootstrap method. Gene expression levels in microarray data often have a complex correlation structure, so accurate type I error control under multiple testing requires the joint null distribution of the test statistics across a large number of genes. For this purpose, permutation methods have been widely used because of their computational ease and intuitive interpretation.
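As a rough illustration of the goodness-of-fit idea described above, the following Python sketch fits a flat (null) trajectory and a natural cubic spline (alternative) trajectory to each gene in a single group and compares their residual sums of squares. The function names, the knot placement, and the specific form (SS0 - SS1)/SS1 of the statistic are assumptions made for this example; the abstract does not spell out these details.

```python
import numpy as np

def natural_cubic_basis(t, knots):
    """Natural cubic spline basis from the truncated power construction.

    With K knots this returns K columns: 1, t, and K-2 cubic terms that
    are constrained to be linear beyond the boundary knots.
    """
    t = np.asarray(t, dtype=float)
    knots = np.asarray(knots, dtype=float)
    K = len(knots)

    def d(k):
        # Scaled difference of truncated cubics at knot k and the last knot.
        num = np.clip(t - knots[k], 0, None) ** 3 - np.clip(t - knots[-1], 0, None) ** 3
        return num / (knots[-1] - knots[k])

    cols = [np.ones_like(t), t]
    for k in range(K - 2):
        cols.append(d(k) - d(K - 2))
    return np.column_stack(cols)

def fit_statistic(Y, t, knots):
    """Y is a genes x samples matrix; returns one statistic per gene.

    Assumed form: (SS0 - SS1) / SS1, where SS0 is the residual sum of
    squares of the flat fit and SS1 that of the spline fit.
    """
    X1 = natural_cubic_basis(t, knots)
    # Null model: constant expression over time (the gene-wise mean).
    ss0 = ((Y - Y.mean(axis=1, keepdims=True)) ** 2).sum(axis=1)
    # Alternative model: least-squares spline fit, all genes at once.
    coef, *_ = np.linalg.lstsq(X1, Y.T, rcond=None)
    ss1 = ((Y.T - X1 @ coef) ** 2).sum(axis=0)
    return (ss0 - ss1) / ss1
```

Because the flat model is nested in the spline model, the statistic is non-negative, and larger values indicate stronger evidence of a time trend.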
In this paper, we propose a permutation-based multiple testing procedure built on the test statistic used by Storey et al. (2005). We also propose an efficient algorithm for its computation. Extensive simulations are conducted to investigate the performance of the procedure. The application of the proposed method is illustrated using the Caenorhabditis elegans dauer developmental data.
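A permutation-based procedure of this general kind can be sketched in Python as follows. This is a minimal illustration and not the authors' algorithm: it assumes a single group with independently sampled arrays (so that samples are exchangeable under the null), uses the single-step maxT adjustment as one common way to control the family-wise error rate, and takes a user-supplied design matrix `X` whose first requirement is that it contains an intercept. The function name `maxT_permutation` and the statistic form (SS0 - SS1)/SS1 are assumptions for the example.

```python
import numpy as np

def maxT_permutation(Y, X, n_perm=200, seed=0):
    """Single-step maxT adjusted p-values for a time-course statistic.

    Y : genes x samples expression matrix.
    X : samples x p design matrix for the alternative model
        (e.g. a spline basis in time); must span the intercept.
    """
    rng = np.random.default_rng(seed)
    n = Y.shape[1]
    # Orthonormal basis for the column space of X, computed once.
    Q, _ = np.linalg.qr(X)
    # The null residual SS and total SS do not change when samples
    # are permuted, so they are computed once as well.
    ss0 = ((Y - Y.mean(axis=1, keepdims=True)) ** 2).sum(axis=1)
    total = (Y ** 2).sum(axis=1)

    def stat(Yp):
        # Residual SS of the alternative fit = total SS - projected SS.
        ss1 = total - ((Yp @ Q) ** 2).sum(axis=1)
        return (ss0 - ss1) / ss1

    observed = stat(Y)
    # Start at 1 so the observed data count as one permutation.
    exceed = np.ones(Y.shape[0])
    for _ in range(n_perm):
        # The same permutation is applied to every gene, which keeps
        # the between-gene correlation intact in the null distribution.
        perm = rng.permutation(n)
        exceed += stat(Y[:, perm]).max() >= observed
    return observed, exceed / (n_perm + 1)
```

Permuting the sample labels jointly across genes is what lets the null distribution of the maximum statistic reflect the dependence among genes. The precomputation of `Q`, `ss0`, and `total` is just one simple saving chosen for this sketch.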
Our method is computationally efficient and applicable both to identifying genes whose expression levels are time-dependent in a single biological group and to identifying genes whose time profile depends on the group in a multi-group setting.