Department of Mathematical Modelling, Statistics and Bioinformatics, Ghent University, Coupure Links 653, B9000 Ghent, Belgium.
BMC Bioinformatics. 2012 Sep 14;13:234. doi: 10.1186/1471-2105-13-234.
Existing statistical methods for tiling array transcriptome data either focus on transcript discovery in one biological or experimental condition or on the detection of differential expression between two conditions. Increasingly often, however, biologists are interested in time-course studies, studies with more than two conditions or even multiple-factor studies. As these studies are currently analyzed with the traditional microarray analysis techniques, they do not exploit the genome-wide nature of tiling array data to its full potential.
We present an R Bioconductor package, waveTiling, which implements a wavelet-based model for analyzing transcriptome data and extends it towards more complex experimental designs. With waveTiling the user can detect (1) group-wise expressed regions, (2) differentially expressed regions between any two groups in single-factor studies and (3) differentially expressed regions in multifactorial designs. Moreover, for time-course experiments it is also possible to detect (4) linear time effects and (5) circadian rhythms in transcript expression. Because the expression values of the individual tiling probes are modelled as a function of genomic position, effect regions can be detected regardless of existing annotation. Three case studies with different experimental set-ups illustrate the use and the flexibility of the model-based transcriptome analysis.
The waveTiling package provides the user with a convenient tool for the analysis of tiling array transcriptome data for a multitude of experimental set-ups. Regardless of the study design, the probe-wise analysis allows for the detection of transcriptional effects in exonic, intronic and intergenic regions, without prior consultation of existing annotation.
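To make the idea of treating probe expression as a function of genomic position more concrete, the following is a minimal sketch in Python. It is not the waveTiling API and not the model used by the package: the function names (`haar_forward`, `haar_inverse`, `effect_regions`), the toy data and both cut-off values are assumptions chosen for illustration, and simple hard thresholding of Haar wavelet coefficients stands in for the package's model-based inference. The sketch only shows how a wavelet decomposition along the genome can smooth a between-group difference and yield annotation-free effect regions.

```python
import math


def haar_forward(signal):
    """Orthonormal Haar transform of a list whose length is a power of two.

    Returns (approx, details), where details holds one list of detail
    coefficients per resolution level, finest level first.
    """
    n = len(signal)
    if n == 0 or n & (n - 1):
        raise ValueError("signal length must be a power of two")
    root2 = math.sqrt(2.0)
    approx = list(signal)
    details = []
    while len(approx) > 1:
        idx = range(0, len(approx), 2)
        details.append([(approx[i] - approx[i + 1]) / root2 for i in idx])
        approx = [(approx[i] + approx[i + 1]) / root2 for i in idx]
    return approx, details


def haar_inverse(approx, details):
    """Invert haar_forward, rebuilding the signal from coarse to fine."""
    root2 = math.sqrt(2.0)
    current = list(approx)
    for level in reversed(details):
        rebuilt = []
        for a, d in zip(current, level):
            rebuilt.append((a + d) / root2)
            rebuilt.append((a - d) / root2)
        current = rebuilt
    return current


def effect_regions(group_a, group_b, threshold, min_effect):
    """Return (start, end) probe-index ranges with a smoothed group difference.

    threshold  : detail coefficients with absolute value at or below this
                 are set to zero (hard thresholding, an illustrative choice).
    min_effect : minimum absolute smoothed difference to call an effect.
    """
    diff = [b - a for a, b in zip(group_a, group_b)]
    approx, details = haar_forward(diff)
    kept = [[d if abs(d) > threshold else 0.0 for d in level]
            for level in details]
    smooth = haar_inverse(approx, kept)

    regions, start = [], None
    for i, value in enumerate(smooth):
        if abs(value) > min_effect:
            if start is None:
                start = i
        elif start is not None:
            regions.append((start, i - 1))
            start = None
    if start is not None:
        regions.append((start, len(smooth) - 1))
    return regions


# Toy data: 16 consecutive probes, mean log-intensity per group.
# Group B carries small alternating noise everywhere and an
# up-regulated block at probes 4 to 7.
group_a = [1.0] * 16
noise = [0.05 if i % 2 == 0 else -0.05 for i in range(16)]
group_b = [1.0 + noise[i] + (2.0 if 4 <= i <= 7 else 0.0) for i in range(16)]

regions = effect_regions(group_a, group_b, threshold=0.5, min_effect=1.0)
print(regions)
```

In this toy setting the regions are reported as probe indices; mapping them back to genomic coordinates would only require the probe positions, which is why no gene annotation enters the procedure.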