Broad Institute of MIT and Harvard, Cambridge, MA 02142, USA.
Cancer Center and Department of Pathology, Massachusetts General Hospital, Boston, MA 02114, USA.
Bioinformatics. 2021 Sep 29;37(18):3048-3050. doi: 10.1093/bioinformatics/btab135.
Post-sequencing quality control is a crucial component of RNA sequencing (RNA-seq) data generation and analysis, as sample quality can be affected by sample storage, extraction and sequencing protocols. RNA-seq is increasingly applied to cohorts ranging from hundreds to tens of thousands of samples, but existing tools do not readily scale to cohorts of this size and were not designed for a wide range of sample types and qualities. Here, we describe RNA-SeQC 2, an efficient reimplementation of RNA-SeQC (DeLuca et al., 2012) that adds multiple metrics designed to characterize sample quality across a wide range of RNA-seq protocols.
The command-line tool, documentation and C++ source code are available at the GitHub repository https://github.com/getzlab/rnaseqc. Code and data for reproducing the figures in this paper are available at https://github.com/getzlab/rnaseqc2-paper.
Supplementary data are available at Bioinformatics online.