European Molecular Biology Laboratory, Meyerhofstraße 1, 69117 Heidelberg, Germany.
Genome Biol. 2010;11(10):R106. doi: 10.1186/gb-2010-11-10-r106. Epub 2010 Oct 27.
High-throughput sequencing assays such as RNA-Seq, ChIP-Seq or barcode counting provide quantitative readouts in the form of count data. To infer differential signal in such data correctly and with good statistical power, estimation of data variability throughout the dynamic range and a suitable error model are required. We propose a method based on the negative binomial distribution, with variance and mean linked by local regression, and present an implementation, DESeq, as an R/Bioconductor package.
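As a rough illustration of why a negative binomial error model is used for count data, the following Python sketch compares negative binomial variance with the Poisson case. It is not the DESeq implementation, which is an R package and links variance to mean by local regression. The sketch instead assumes the common parametric form variance = mu + alpha * mu^2, and the names `nb_variance`, `sample_poisson`, `sample_nb` and the dispersion parameter `alpha` are hypothetical, chosen only for this example.

```python
import math
import random
from statistics import mean, variance


def nb_variance(mu, alpha):
    """Variance of a negative binomial with mean mu and dispersion alpha.

    Assumed parameterization (not DESeq's local-regression fit):
    variance = mu + alpha * mu**2. With alpha = 0 this is the Poisson case.
    """
    return mu + alpha * mu ** 2


def sample_poisson(lam, rng):
    """Draw one Poisson count with Knuth's multiplication method (small lam only)."""
    threshold = math.exp(-lam)
    k, p = 0, 1.0
    while True:
        p *= rng.random()
        if p <= threshold:
            return k
        k += 1


def sample_nb(mu, alpha, rng):
    """Draw one negative binomial count as a gamma-Poisson mixture."""
    # Gamma with shape 1/alpha and scale alpha*mu has mean mu
    # and variance alpha * mu**2.
    lam = rng.gammavariate(1.0 / alpha, alpha * mu)
    return sample_poisson(lam, rng)


if __name__ == "__main__":
    rng = random.Random(0)
    draws = [sample_nb(20, 0.25, rng) for _ in range(5000)]
    print("sample mean:", mean(draws))
    print("sample variance:", variance(draws))
    print("assumed NB variance:", nb_variance(20, 0.25))
    print("Poisson variance:", nb_variance(20, 0.0))
```

Under these assumptions the simulated variance should sit well above the mean, which is the overdispersion that a Poisson model (variance equal to mean) cannot represent.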