Stegle Oliver, Denby Katherine J, Cooke Emma J, Wild David L, Ghahramani Zoubin, Borgwardt Karsten M
Interdepartmental Bioinformatics Group, Max Planck Institute for Developmental Biology, Max Planck Institute for Biological Cybernetics, Tübingen, Germany.
J Comput Biol. 2010 Mar;17(3):355-67. doi: 10.1089/cmb.2009.0175.
Understanding the regulatory mechanisms responsible for an organism's response to environmental change is an important issue in molecular biology. A first and important step towards this goal is to detect genes whose expression levels are affected by altered external conditions. A range of methods to test for differential gene expression, in both static and time-course experiments, has been proposed. While these tests answer the question of whether a gene is differentially expressed, they do not explicitly address the question of when a gene is differentially expressed, although this information may provide insights into the course and causal structure of regulatory programs. In this article, we propose a two-sample test for identifying intervals of differential gene expression in microarray time series. Our approach is based on Gaussian process regression, can deal with arbitrary numbers of replicates, and is robust with respect to outliers. We apply our algorithm to study the response of Arabidopsis thaliana genes to infection by a fungal pathogen, using a microarray time series dataset covering 30,336 gene probes at 24 observed time points. In classification experiments, our test compares favorably with existing methods and provides additional insights into time-dependent differential expression.
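The abstract does not spell out the test itself, so the following is only a minimal sketch of the general idea behind a Gaussian process two-sample comparison, not the authors' implementation. It assumes a squared-exponential kernel with fixed, hand-picked hyperparameters and plain Gaussian noise, and it scores a whole time series at once. The outlier-robust noise model and the detection of intervals described in the abstract are left out. All function names and parameter values are hypothetical.

```python
import numpy as np

def rbf_kernel(t1, t2, amplitude=1.0, length_scale=2.0):
    """Squared-exponential covariance between two vectors of time points."""
    d = t1[:, None] - t2[None, :]
    return amplitude**2 * np.exp(-0.5 * (d / length_scale) ** 2)

def log_marginal_likelihood(t, y, noise=0.3, **kernel_args):
    """Log marginal likelihood of a zero-mean GP with Gaussian noise.

    Hyperparameters are fixed here (an assumption of this sketch);
    in practice they would be optimised or integrated out.
    """
    K = rbf_kernel(t, t, **kernel_args) + noise**2 * np.eye(len(t))
    L = np.linalg.cholesky(K)                       # K = L L^T
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y))  # K^{-1} y
    return float(
        -0.5 * y @ alpha
        - np.log(np.diag(L)).sum()                  # equals 0.5 * log|K|
        - 0.5 * len(t) * np.log(2.0 * np.pi)
    )

def gp_two_sample_score(t_a, y_a, t_b, y_b, **kwargs):
    """Log ratio: two independent GPs versus one GP shared by both conditions.

    Replicates are handled by simply repeating time points in t_a / t_b.
    """
    shared = log_marginal_likelihood(
        np.concatenate([t_a, t_b]), np.concatenate([y_a, y_b]), **kwargs
    )
    independent = (
        log_marginal_likelihood(t_a, y_a, **kwargs)
        + log_marginal_likelihood(t_b, y_b, **kwargs)
    )
    return independent - shared

# Toy data: 24 time points, condition B diverges from A in the second half.
t = np.arange(24.0)
control = np.sin(t / 4.0)
treated = control + 2.0 * (t >= 12)

score_same = gp_two_sample_score(t, control, t, control)
score_diff = gp_two_sample_score(t, control, t, treated)
```

A positive score favours separate curves for the two conditions (differential expression), while a negative score favours a single shared curve. Applying the same comparison locally around each time point would be one way to move from a per-gene decision towards intervals, but that step is not shown here.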