Centro Regional de Estudios Genómicos, Universidad Nacional de La Plata, Florencio Varela, Argentina.
PLoS One. 2011;6(10):e26291. doi: 10.1371/journal.pone.0026291. Epub 2011 Oct 18.
The microarray technique allows the simultaneous measurement of the expression levels of thousands of mRNAs. By mining these data one can identify the dynamics of the gene expression time series. Detecting genes that are periodically expressed is an important step toward studying the regulatory mechanisms associated with the circadian cycle. Finding periodicity in biological time series poses many challenges, because the observed time series usually exhibit non-idealities such as noise, short length, outliers and unevenly sampled time points. Consequently, a method for finding periodicity should preferably be robust against such anomalies in the data. In this paper, we propose a general and robust procedure for identifying genes with a periodic signature at a given significance level. This identification method is based on autoregressive models and information theory. Using simulated data, we show that the suggested method is capable of identifying rhythmic profiles even in the presence of noise and when the number of data points is small. Applying our analysis, we uncover the circadian rhythmic patterns underlying the gene expression profiles of the cyanobacterium Synechocystis.
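The abstract does not spell out the procedure, so the following is only a minimal sketch of the general idea of combining autoregressive (AR) models with an information criterion, not the authors' method. It assumes evenly sampled data, uses the Akaike information criterion (AIC) as the information-theoretic score, and reads a period off the complex roots of a fitted AR(2) model. The function names `fit_ar` and `ar2_period`, the 2-hour sampling step, and the simulated profile are all hypothetical, and no significance level is computed.

```python
import numpy as np

def fit_ar(x, p, max_p=2):
    """Least-squares AR(p) fit; returns (coefficients, AIC).

    All orders are fitted on the same targets x[max_p:], so their AIC
    values are comparable.
    """
    x = np.asarray(x, dtype=float)
    x = x - x.mean()                      # remove the mean level
    y = x[max_p:]                         # values to be predicted
    n = len(y)
    if p == 0:
        coef = np.zeros(0)                # white-noise model: no lags
        resid = y
    else:
        # Column k holds the series lagged by k samples.
        X = np.column_stack([x[max_p - k:len(x) - k] for k in range(1, p + 1)])
        coef, *_ = np.linalg.lstsq(X, y, rcond=None)
        resid = y - X @ coef
    rss = float(resid @ resid)
    aic = n * np.log(rss / n) + 2 * (p + 1)
    return coef, aic

def ar2_period(coef, dt):
    """Period implied by an AR(2) model, or None if its roots are real."""
    a1, a2 = coef
    # Characteristic polynomial of x_t = a1*x_{t-1} + a2*x_{t-2}.
    roots = np.roots([1.0, -a1, -a2])
    if np.allclose(roots.imag, 0.0):
        return None                       # no oscillatory component
    theta = abs(np.angle(roots[0]))       # radians per sample
    return 2 * np.pi / theta * dt

# Simulated short, noisy profile: 24 points, one every 2 h, 24 h period.
rng = np.random.default_rng(0)
dt = 2.0
t = np.arange(24) * dt
x = np.cos(2 * np.pi * t / 24.0) + rng.normal(0.0, 0.05, t.size)

_, aic0 = fit_ar(x, 0)                    # no-rhythm baseline
coef2, aic2 = fit_ar(x, 2)                # candidate oscillatory model
period = ar2_period(coef2, dt)
print("AIC white noise:", aic0, "AIC AR(2):", aic2, "period (h):", period)
```

In this sketch a profile would be flagged as rhythmic when the AR(2) model has a lower AIC than the white-noise baseline and its roots are complex. Measurement noise biases least-squares AR estimates, so the recovered period is approximate, and unevenly sampled series would need a different estimator.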