Billups Stephen C, Neville Margaret C, Rudolph Michael, Porter Weston, Schedin Pepper
Department of Mathematical and Statistical Sciences, University of Colorado, Denver, CO, USA.
BMC Bioinformatics. 2009 Mar 26;10:96. doi: 10.1186/1471-2105-10-96.
An important component of time course microarray studies is the identification of genes that demonstrate significant time-dependent variation in their expression levels. Until recently, available methods for performing such significance tests required replicates of individual time points. This paper describes a replicate-free method that was developed as part of a study of the estrous cycle in the rat mammary gland in which no replicate data were collected.
A temporal test statistic is proposed that is based on the degree to which data are smoothed when fit by a spline function. An algorithm is presented that uses this test statistic together with a false discovery rate method to identify genes whose expression profiles exhibit significant temporal variation. The algorithm is tested on simulated data and compared with another recently published replicate-free method. The simulated data consist of both genes with known temporal dependencies and genes from a null distribution. For a given false discovery rate, the proposed algorithm identifies a larger percentage of the time-dependent genes than the comparison method. Use of the algorithm in a study of the estrous cycle in the rat mammary gland resulted in the identification of genes exhibiting distinct circadian variation. These results were confirmed in follow-up laboratory experiments.
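The false discovery rate step can be illustrated with a minimal sketch. The abstract does not say which false discovery rate method the algorithm uses, so the Benjamini-Hochberg step-up procedure is assumed here purely as a common choice. The function name `benjamini_hochberg` and the example p-values are hypothetical; in practice the p-values would come from the spline-based temporal test statistic, which is not reproduced here.

```python
def benjamini_hochberg(p_values, fdr=0.05):
    """Return the sorted indices of hypotheses rejected at the given FDR level.

    Assumption: Benjamini-Hochberg is used for illustration only; the
    abstract does not name the FDR method used by the authors.
    """
    m = len(p_values)
    if m == 0:
        return []
    # Indices of the p-values in ascending order of p-value.
    order = sorted(range(m), key=lambda i: p_values[i])
    # Step-up rule: find the largest rank k with p_(k) <= (k / m) * fdr.
    cutoff = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank / m * fdr:
            cutoff = rank
    # Reject every hypothesis ranked at or below the cutoff.
    return sorted(order[:cutoff])


# Hypothetical per-gene p-values (one entry per gene).
example_p = [0.041, 0.001, 0.205, 0.008, 0.039]
significant_genes = benjamini_hochberg(example_p, fdr=0.05)
```

Because the procedure is step-up, a gene whose own p-value exceeds its rank threshold can still be selected when a higher-ranked p-value meets its threshold, which is why the loop keeps scanning instead of stopping at the first failure.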
The proposed algorithm provides a new approach for identifying expression profiles with significant temporal variation without relying on replicates. When compared with a recently published algorithm on simulated data, the proposed algorithm appears to identify a larger percentage of time-dependent genes for a given false discovery rate. The development of the algorithm was instrumental in revealing the presence of circadian variation in the virgin rat mammary gland during the estrous cycle.