D'Antonio Mattia, D'Onorio De Meo Paolo, Pallocca Matteo, Picardi Ernesto, D'Erchia Anna Maria, Calogero Raffaele A, Castrignanò Tiziana, Pesole Graziano
BMC Genomics. 2015;16(Suppl 6):S3. doi: 10.1186/1471-2164-16-S6-S3. Epub 2015 Jun 1.
The study of RNA has been dramatically improved by the introduction of Next Generation Sequencing (NGS) platforms, which allow massive and cheap sequencing of selected RNA fractions while also providing information on strand orientation (RNA-Seq). The complexity of transcriptomes and of their regulatory pathways makes RNA-Seq one of the most complex fields of NGS application, addressing several aspects of the expression process (e.g. identification and quantification of expressed genes and transcripts, alternative splicing and polyadenylation, fusion genes and trans-splicing, post-transcriptional events, etc.).
To provide researchers with an effective and user-friendly resource for analyzing RNA-Seq data, we present RAP (RNA-Seq Analysis Pipeline), a cloud computing web application implementing a complete but modular analysis workflow. The pipeline integrates state-of-the-art bioinformatics tools for RNA-Seq analysis with in-house developed scripts to offer the user a comprehensive data analysis strategy. RAP can perform quality checks (adopting FastQC and NGS QC Toolkit), identify and quantify expressed genes and transcripts (with Tophat, Cufflinks and HTSeq), and detect alternative splicing events (using SpliceTrap) and chimeric transcripts (with ChimeraScan). The pipeline can also identify splicing junctions and constitutive or alternative polyadenylation sites (through custom analysis modules) and call statistically significant differences in gene and transcript expression, splicing patterns and polyadenylation site usage (using Cuffdiff2 and DESeq).
Through a user-friendly web interface, the RAP workflow can be customized by the user and is then executed automatically on our cloud computing environment. This strategy gives access to bioinformatics tools and computational resources without requiring specific bioinformatics or IT skills. RAP provides a set of tabular and graphical results that help users browse, filter and export the analyzed data according to their needs.
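As an illustration of what "complete but modular" can mean in practice, the following minimal Python sketch models the analysis steps named above as selectable modules. It is not RAP's implementation: the module names, the `requires` dependencies between modules and the `plan_workflow` helper are all assumptions introduced for this example; only the tool names are taken from the abstract.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Module:
    """One selectable analysis step (hypothetical structure, not RAP's own)."""
    name: str
    tools: tuple
    requires: tuple = ()  # assumed prerequisites, not stated in the abstract


# Listed in an order where every prerequisite precedes its dependents.
MODULES = (
    Module("quality_check", ("FastQC", "NGS QC Toolkit")),
    Module("expression", ("Tophat", "Cufflinks", "HTSeq"), ("quality_check",)),
    Module("alternative_splicing", ("SpliceTrap",), ("quality_check",)),
    Module("chimeric_transcripts", ("ChimeraScan",), ("quality_check",)),
    Module("junctions_and_polyadenylation", ("custom modules",), ("expression",)),
    Module("differential_analysis", ("Cuffdiff2", "DESeq"), ("expression",)),
)


def plan_workflow(selected):
    """Return the selected modules plus their assumed prerequisites, in run order."""
    by_name = {m.name: m for m in MODULES}
    unknown = set(selected) - by_name.keys()
    if unknown:
        raise ValueError(f"unknown modules: {sorted(unknown)}")

    # Collect the selected modules and everything they transitively require.
    needed = set()
    stack = list(selected)
    while stack:
        name = stack.pop()
        if name not in needed:
            needed.add(name)
            stack.extend(by_name[name].requires)

    # MODULES is already dependency-ordered, so filtering keeps a valid order.
    return [m.name for m in MODULES if m.name in needed]


if __name__ == "__main__":
    for step in plan_workflow(["differential_analysis"]):
        print(step)
```

In this sketch a user picks only the analyses of interest and the planner adds whatever those analyses are assumed to depend on, which is one simple way a customizable workflow could be resolved before execution.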