Department of Biostatistics, University of North Carolina-Chapel Hill, Chapel Hill, NC, USA.
Department of Genetics, University of North Carolina-Chapel Hill, Chapel Hill, NC, USA.
Methods Mol Biol. 2021;2284:97-134. doi: 10.1007/978-1-0716-1307-8_7.
Statistical modeling of count data from RNA sequencing (RNA-seq) experiments is important for proper interpretation of results. Here I describe how count data can be modeled using count distributions or, alternatively, analyzed using nonparametric methods. I focus on basic routines for data input, scaling/normalization, visualization, and statistical testing to identify sets of features whose counts reflect differences in gene expression across samples. Finally, I discuss limitations and possible extensions of the models presented here.