SPEAQeasy: a scalable pipeline for expression analysis and quantification for R/bioconductor-powered RNA-seq analyses.

Affiliations

Lieber Institute for Brain Development, Johns Hopkins Medical Campus, Baltimore, MD, 21205, USA.

Winter Genomics, Salaverry 874 int 100, Lindavista, CDMX, 07300, Mexico.

Publication information

BMC Bioinformatics. 2021 May 1;22(1):224. doi: 10.1186/s12859-021-04142-3.

Abstract

BACKGROUND

RNA sequencing (RNA-seq) is a common and widespread biological assay, and it generates an increasing amount of data. In practice, a researcher must perform a large number of individual steps before raw RNA-seq reads yield directly valuable information, such as differential gene expression data. Existing software tools are typically specialized, performing only one step of a larger workflow, such as alignment of reads to a reference genome. The demand for a more comprehensive and reproducible workflow has led to the production of a number of publicly available RNA-seq pipelines. However, we have found that most require computational expertise to set up or share among several users, are not actively maintained, or lack features we have found to be important in our own analyses.

RESULTS

In response to these concerns, we have developed a Scalable Pipeline for Expression Analysis and Quantification (SPEAQeasy), which is easy to install and share, and provides a bridge towards R/Bioconductor downstream analysis solutions. SPEAQeasy is portable across computational frameworks (SGE, SLURM, local, Docker integration), with different configuration files provided (http://research.libd.org/SPEAQeasy/).

CONCLUSIONS

SPEAQeasy is user-friendly and lowers the computational barrier to RNA-seq data processing for biologists and clinicians, because the main input file is a table of sample names and their corresponding FASTQ files. The goal is to provide a flexible pipeline that is immediately usable by researchers, regardless of their technical background or computing environment.
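To make the idea of a sample table concrete, the following is a minimal Python sketch of how sample names might be paired with their FASTQ files. It is an illustration only: the `_R1`/`_R2` file-naming convention, the column names (`sample`, `read1`, `read2`), and the helper functions `build_sample_table` and `to_tsv` are assumptions made for this example and are not taken from SPEAQeasy, whose actual input format is described in its documentation.

```python
import csv
import io
import re
from collections import defaultdict

# Assumed (hypothetical) naming convention: <sample>_R1.fastq.gz, <sample>_R2.fastq.gz
FASTQ_PATTERN = re.compile(r"^(?P<sample>.+?)_R(?P<mate>[12])\.fastq(\.gz)?$")


def build_sample_table(fastq_paths):
    """Group FASTQ paths by sample name; return sorted (sample, read1, read2) rows."""
    mates = defaultdict(dict)
    for path in fastq_paths:
        filename = path.rsplit("/", 1)[-1]
        match = FASTQ_PATTERN.match(filename)
        if match is None:
            raise ValueError(f"Unrecognized FASTQ file name: {filename}")
        mates[match.group("sample")][match.group("mate")] = path

    rows = []
    for sample in sorted(mates):
        files = mates[sample]
        if "1" not in files:
            raise ValueError(f"Sample {sample} has no R1 file")
        # Single-end samples get an empty read2 column.
        rows.append((sample, files["1"], files.get("2", "")))
    return rows


def to_tsv(rows):
    """Serialize the rows as tab-separated text with a header line."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, delimiter="\t", lineterminator="\n")
    writer.writerow(["sample", "read1", "read2"])
    writer.writerows(rows)
    return buffer.getvalue()
```

Calling `to_tsv(build_sample_table(paths))` on a list of FASTQ paths yields one row per sample, which is the kind of table a non-specialist could also prepare by hand in a spreadsheet.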

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/8f74/8088074/b58ea32a2d9c/12859_2021_4142_Fig1_HTML.jpg
