Division of Experimental Medicine, Department of Medicine, University of California, San Francisco, California, 94143, USA.
Department of Biology, University of Washington, Seattle, WA, 98195, USA.
BMC Bioinformatics. 2018 Nov 14;19(1):423. doi: 10.1186/s12859-018-2445-2.
RNA-sequencing analysis methods are evolving rapidly. In one common workflow, differential expression analysis, which comprises read alignment, expression modeling, and identification of differentially expressed genes, the choice of tool at each step has a dramatic impact on performance. Although a number of workflows are emerging as high performers that are robust to diverse input types, their relative performance when either read depth or sample number is limited (a common occurrence in real-world practice) remains unexplored.
Here, we evaluate the impact of varying read depth and sample number on the performance of differential gene expression identification workflows, as measured by precision (the fraction of genes identified as differentially expressed that are truly differentially expressed) and recall (the fraction of truly differentially expressed genes that are identified). We focus our analysis on 30 high-performing workflows, systematically varying the read depth and the number of biological replicates of the patient monocyte samples provided as input. We find that, for most workflows, read depth has little effect on performance above two million reads per sample, and that performance declines below this threshold. Decreased sample number has its greatest impact below seven samples per group, where workflow performance becomes more heterogeneous. The choice of differential expression identification tool, in particular, has a large impact on the response to limited inputs.
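The two metrics defined above can be illustrated with a minimal sketch. The function name, the gene identifiers, and the reference ("truth") set are hypothetical and chosen only for illustration; they do not come from the study, and this is not the authors' evaluation code.

```python
def precision_recall(called, truth):
    """Return (precision, recall) for a set of genes called differentially
    expressed, judged against a reference set of truly differentially
    expressed genes. Both inputs are iterables of gene identifiers."""
    called, truth = set(called), set(truth)
    true_positives = len(called & truth)
    # Precision: of the genes the workflow called, how many are truly DE?
    precision = true_positives / len(called) if called else float("nan")
    # Recall: of the truly DE genes, how many did the workflow call?
    recall = true_positives / len(truth) if truth else float("nan")
    return precision, recall


# Hypothetical example: four truly DE genes, three genes called by a workflow.
truth = {"GENE_A", "GENE_B", "GENE_C", "GENE_D"}
called = {"GENE_A", "GENE_B", "GENE_E"}

precision, recall = precision_recall(called, truth)
print(f"precision={precision:.3f} recall={recall:.3f}")
```

In this toy case two of the three called genes are in the reference set (precision 2/3), and two of the four reference genes are recovered (recall 1/2). The same workflow can therefore score differently on the two metrics.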
Among the tested workflows, the recall/precision balance remains relatively stable across a range of read depths and sample numbers, although some workflows are more sensitive to input restriction. At ranges typically recommended for biological studies, performance is affected more by the number of biological replicates than by read depth. Caution should be used when selecting analysis workflows and interpreting results from experiments with low sample numbers, as all workflows perform more poorly at lower sample numbers near typically reported values, with variable impact on recall versus precision. These analyses highlight the performance characteristics of common differential gene expression workflows at varying read depths and sample numbers, and provide empirical guidance for experimental and analytical design.