CAMISIM: simulating metagenomes and microbial communities.

Affiliations

Computational Biology of Infection Research, Helmholtz Centre for Infection Research, Braunschweig, 38124, Germany.

Formerly Department of Algorithmic Bioinformatics, Heinrich-Heine University Düsseldorf, Düsseldorf, 40225, Germany.

Publication information

Microbiome. 2019 Feb 8;7(1):17. doi: 10.1186/s40168-019-0633-6.

DOI:10.1186/s40168-019-0633-6

PMID:30736849

Original article: https://pmc.ncbi.nlm.nih.gov/articles/PMC6368784/

Abstract

BACKGROUND

Shotgun metagenome data sets of microbial communities are highly diverse, not only due to the natural variation of the underlying biological systems, but also due to differences in laboratory protocols, replicate numbers, and sequencing technologies. Accordingly, to effectively assess the performance of metagenomic analysis software, a wide range of benchmark data sets are required.

RESULTS

We describe the CAMISIM microbial community and metagenome simulator. The software can model different microbial abundance profiles, multi-sample time series, and differential abundance studies, includes real and simulated strain-level diversity, and generates second- and third-generation sequencing data from taxonomic profiles or de novo. Gold standards are created for sequence assembly, genome binning, taxonomic binning, and taxonomic profiling. CAMISIM generated the benchmark data sets of the first CAMI challenge. For two simulated multi-sample data sets of the human and mouse gut microbiomes, we observed high functional congruence to the real data. As further applications, we investigated the effect of varying evolutionary genome divergence, sequencing depth, and read error profiles on two popular metagenome assemblers, MEGAHIT and metaSPAdes, on several thousand small data sets generated with CAMISIM.

CONCLUSIONS

CAMISIM can simulate a wide variety of microbial communities and metagenome data sets together with standards of truth for method evaluation. All data sets and the software are freely available at https://github.com/CAMI-challenge/CAMISIM.

Figure 1: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/6b0c/6368784/841cf7054d2a/40168_2019_633_Fig1_HTML.jpg
