Department of Plant Sciences, University of Oxford, Oxford, UK.
Brief Bioinform. 2018 Jul 20;19(4):622-626. doi: 10.1093/bib/bbw143.
RNA-Seq has gradually become a routine approach for characterizing transcriptomes across organisms, cell types and conditions. This places a heavy burden on data analysis and calls for an easy-to-learn workflow that can meet the growing demand from laboratories around the world. We report an all-in-one solution called hppRNA, composed of four scenarios: pre-mapping, core-workflow, post-mapping and sequence variation detection. It is implemented as a series of individual Perl and R scripts that rely on well-established, preinstalled software, and it handles single-end or paired-end data from unstranded or stranded sequencing. It features six independent core-workflows (TopHat-Cufflinks-Cuffdiff, Subread-featureCounts-DESeq2, STAR-RSEM-EBSeq, Bowtie-eXpress-edgeR, kallisto-sleuth and HISAT-StringTie-Ballgown) that together draw on dozens of popular, state-of-the-art tools, and it is embedded in Snakemake, a modern pipeline management system. The core function of the pipeline is to turn raw fastq files into gene/isoform expression matrices and lists of differentially expressed genes or isoforms, and to identify fusion genes, single nucleotide polymorphisms, long noncoding RNAs and circular RNAs. Finally, the pipeline is specifically designed for systematic analysis of a very large set of samples in one run, which makes it well suited to researchers who intend to deploy it on their local servers. The scripts and the user manual are freely available at https://sourceforge.net/projects/hpprna/.
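The six core-workflows named above can be pictured as a simple lookup from a workflow label to its chain of tools. The following Python sketch is only an illustration of that structure: the labels, the `CORE_WORKFLOWS` table and the `describe_workflow` helper are hypothetical and are not part of hppRNA's scripts or its command-line interface. Tool names use their usual spellings.

```python
from typing import Dict, Tuple

# Tool chains as listed in the abstract. The dictionary keys are
# hypothetical labels chosen for this sketch, not hppRNA options.
CORE_WORKFLOWS: Dict[str, Tuple[str, ...]] = {
    "tophat": ("TopHat", "Cufflinks", "Cuffdiff"),
    "subread": ("Subread", "featureCounts", "DESeq2"),
    "star": ("STAR", "RSEM", "EBSeq"),
    "bowtie": ("Bowtie", "eXpress", "edgeR"),
    "kallisto": ("kallisto", "sleuth"),
    "hisat": ("HISAT", "StringTie", "Ballgown"),
}


def describe_workflow(label: str) -> str:
    """Return the tool chain for a (hypothetical) workflow label as text."""
    try:
        tools = CORE_WORKFLOWS[label]
    except KeyError:
        raise ValueError(f"unknown workflow label: {label!r}") from None
    return " -> ".join(tools)


if __name__ == "__main__":
    # Print every chain, one per line.
    for label in CORE_WORKFLOWS:
        print(f"{label}: {describe_workflow(label)}")
```

Note that five of the chains have three stages while kallisto-sleuth has two, so any code that walks these chains should not assume a fixed length.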