Systems Biology Department, Centro Nacional de Biotecnología (CNB-CSIC), C/ Darwin n° 3, Campus de Cantoblanco, 28049, Madrid, Spain.
BMC Bioinformatics. 2020 Aug 14;21(1):358. doi: 10.1186/s12859-020-03703-2.
The dramatic decrease in sequencing costs over the last decade has boosted the adoption of high-throughput sequencing applications as a standard tool for the analysis of environmental microbial communities. Nowadays even small research groups can easily obtain raw sequencing data. Once the data are in hand, however, non-specialists face a double challenge: choosing among an ever-increasing array of analysis methodologies, and navigating the vast amounts of results that these approaches return.
Here we present a workflow that relies on the SqueezeMeta software for the automated processing of raw reads into annotated contigs and reconstructed genomes (bins). A set of custom scripts seamlessly integrates the output into the anvi'o analysis platform, allowing filtering and visual exploration of the results. Furthermore, we provide a software package with utility functions to expose the SqueezeMeta results to the R analysis environment.
Altogether, our workflow allows non-expert users to go from raw sequencing reads to custom plots with only a few powerful, flexible and well-documented commands.