Structural and Computational Biology Unit, European Molecular Biology Laboratory, Heidelberg, Germany.
Institute of Science and Technology for Brain-Inspired Intelligence, Fudan University, Shanghai, China.
Microbiome. 2019 Jun 3;7(1):84. doi: 10.1186/s40168-019-0684-8.
BACKGROUND: Shotgun metagenomes contain a sample of all the genomic material in an environment, allowing for the characterization of a microbial community. Bioinformatics methods are crucial to understanding these communities. A common first step in processing metagenomes is to compute abundance estimates of different taxonomic or functional groups from the raw sequencing data. Given the breadth of the field, computational solutions need to be flexible and extensible, so that different tools can be combined into a larger pipeline.

RESULTS: We present NGLess and NG-meta-profiler. NGLess is a domain-specific language for describing next-generation sequence processing pipelines. It was developed with the goal of enabling user-friendly computational reproducibility. It provides built-in support for many common operations on sequencing data and can be extended with external tools through configuration files. Using this framework, we developed NG-meta-profiler, a fast profiler for metagenomes which performs sequence preprocessing, mapping to bundled databases, filtering of the mapping results, and profiling (taxonomic and functional). It is significantly faster than either MOCAT2 or htseq-count and (as it builds on NGLess) its results are perfectly reproducible.

CONCLUSIONS: NG-meta-profiler is a high-performance solution for metagenomics processing built on NGLess. It can be used as-is to execute standard analyses or serve as the starting point for customization in a perfectly reproducible fashion. NGLess and NG-meta-profiler are open source software (under the liberal MIT license) and can be downloaded from https://ngless.embl.de or installed through bioconda.
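The stages named in the abstract (preprocessing, filtering of mapping results, and profiling) can be illustrated with a small Python sketch. This is not NGLess code and not the NG-meta-profiler implementation: the `Alignment` record, the function names, and the thresholds (`min_qual`, `min_identity`, `min_len`) are hypothetical choices made only for illustration, and the mapping step itself is replaced by hand-written hits.

```python
from collections import Counter
from typing import NamedTuple, Sequence


class Alignment(NamedTuple):
    """One read-to-reference hit (hypothetical, simplified record)."""
    read_id: str
    feature: str      # gene or taxon that the reference sequence belongs to
    identity: float   # fraction of matching bases, 0.0 to 1.0
    aligned_len: int  # number of aligned bases


def trim_low_quality(seq: str, quals: Sequence[int], min_qual: int = 25) -> str:
    """Preprocessing: cut the read at the first base below min_qual."""
    for i, q in enumerate(quals):
        if q < min_qual:
            return seq[:i]
    return seq


def filter_alignments(hits, min_identity=0.90, min_len=45):
    """Filtering: keep hits that pass both the identity and length cutoffs."""
    return [h for h in hits
            if h.identity >= min_identity and h.aligned_len >= min_len]


def profile(hits):
    """Profiling: count distinct reads per feature, then normalize to sum to 1."""
    seen = set()
    reads_per_feature = Counter()
    for h in hits:
        key = (h.read_id, h.feature)
        if key not in seen:  # count each read at most once per feature
            seen.add(key)
            reads_per_feature[h.feature] += 1
    total = sum(reads_per_feature.values())
    if total == 0:
        return {}
    return {feature: n / total for feature, n in reads_per_feature.items()}


# Toy input standing in for the output of a read mapper (assumed data).
hits = [
    Alignment("r1", "geneA", 0.98, 90),
    Alignment("r2", "geneA", 0.95, 80),
    Alignment("r3", "geneB", 0.97, 75),
    Alignment("r4", "geneB", 0.80, 100),  # dropped: identity too low
    Alignment("r5", "geneA", 0.99, 30),   # dropped: alignment too short
    Alignment("r1", "geneA", 0.96, 60),   # same read and feature as the first hit
]

trimmed = trim_low_quality("ACGTACGT", [40, 40, 38, 35, 20, 39, 40, 40])
kept = filter_alignments(hits)
abundance = profile(kept)
print(trimmed, abundance)
```

In this toy input, two distinct reads support `geneA` and one supports `geneB`, so the relative abundances come out as two thirds and one third. A real profiler also has to decide how to handle reads that map to several features, which this sketch does not address.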