Golan David, Rosset Saharon
School of Mathematical Sciences, Tel Aviv University, Tel Aviv, Israel.
Methods Mol Biol. 2013;1038:61-79. doi: 10.1007/978-1-62703-514-9_4.
In high-throughput sequencing experiments, the number of reads mapping to a genomic region, also known as the "coverage" or "coverage depth," is often used as a proxy for the abundance of the underlying genomic region in the sample. The abundance, in turn, can be used for many purposes, including calling SNPs, estimating the allele frequency in a pool of individuals, identifying copy number variations, and identifying differentially expressed shRNAs in shRNA-seq experiments. In this chapter we describe the fundamentals of statistical modeling of coverage depth and discuss the problems of estimation and inference in the relevant experimental scenarios.