Department of Microbiology, University of Pennsylvania, Philadelphia, PA, 19104, USA.
Division of Gastroenterology, Hepatology and Nutrition, The Children's Hospital of Philadelphia, Philadelphia, PA, 19104, USA.
Microbiome. 2019 Mar 22;7(1):46. doi: 10.1186/s40168-019-0658-x.
Analysis of mixed microbial communities using metagenomic sequencing experiments requires multiple preprocessing and analytical steps to interpret the microbial and genetic composition of samples. Analytical steps include quality control, adapter trimming, host decontamination, metagenomic classification, read assembly, and alignment to reference genomes.
We present a modular and user-extensible pipeline called Sunbeam that performs these steps in a consistent and reproducible fashion. It can be installed in a single step, does not require administrative access to the host computer system, and can work with most cluster computing frameworks. We also introduce Komplexity, a software tool to eliminate potentially problematic, low-complexity nucleotide sequences from metagenomic data. A unique component of the Sunbeam pipeline is an easy-to-use extension framework that enables users to add custom processing or analysis steps directly to the workflow. The pipeline and its extension framework are well documented, in routine use, and regularly updated.
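The idea behind low-complexity filtering can be illustrated with a short sketch. It scores each read by the number of distinct k-mers divided by the read length and discards reads that fall below a cutoff. This is not Komplexity's actual implementation: the function names, the k-mer size of 4, and the threshold of 0.55 are assumptions chosen only for illustration.

```python
from typing import Iterable, List, Tuple


def kmer_complexity(seq: str, k: int = 4) -> float:
    """Return (number of distinct k-mers) / (sequence length).

    Illustrative score only; k=4 is an assumed default.
    Sequences shorter than k get a score of 0.0.
    """
    seq = seq.upper()
    if len(seq) < k:
        return 0.0
    # Collect every overlapping k-mer into a set to count distinct ones.
    kmers = {seq[i:i + k] for i in range(len(seq) - k + 1)}
    return len(kmers) / len(seq)


def filter_low_complexity(
    reads: Iterable[Tuple[str, str]],
    k: int = 4,
    threshold: float = 0.55,  # assumed cutoff, for illustration only
) -> List[Tuple[str, str]]:
    """Keep (name, sequence) pairs whose score meets the threshold."""
    return [(name, seq) for name, seq in reads
            if kmer_complexity(seq, k) >= threshold]


if __name__ == "__main__":
    reads = [
        ("homopolymer", "A" * 50),
        ("diverse", "ACGTTGCAAGCTTAGGCTAACGGATCCATG"),
    ]
    for name, seq in filter_low_complexity(reads):
        print(name)
```

Under these assumptions, a homopolymer run contains a single distinct k-mer and scores near zero, so it is dropped, whereas a read with varied sequence content is retained.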
Sunbeam provides a foundation to build more in-depth analyses and to enable comparisons in metagenomic sequencing experiments by removing problematic, low-complexity reads and standardizing post-processing and analytical steps. Sunbeam is written in Python using the Snakemake workflow management software and is freely available at github.com/sunbeam-labs/sunbeam under the GPLv3.