Chaabane Farid, Pillonel Trestan, Bertelli Claire
Institute of Microbiology, Lausanne University Hospital and University of Lausanne, Lausanne, 1011, Switzerland.
Bioinformatics. 2024 Dec 26;41(1). doi: 10.1093/bioinformatics/btae760.
The intrinsic complexity of the microbiota, combined with technical variability, renders shotgun metagenomics challenging to analyze in routine clinical or research applications. In silico data generation offers a controlled environment in which to, for example, benchmark bioinformatics tools, optimize study design and statistical power, or validate targeted applications. Here, we propose assembly_finder and the Metagenomic Sequence Simulator (MeSS), two easy-to-use Bioconda packages, as part of a benchmarking toolkit to download genomes and simulate shotgun metagenomics samples, respectively. MeSS outperforms existing tools in speed while requiring less memory, and reproducibly generates accurate complex communities from a list of taxonomic ranks and their abundances.
All code is released under the MIT License and is available at https://github.com/metagenlab/MeSS and https://github.com/metagenlab/assembly_finder.
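To make the idea of simulating a community "from a list of taxa and their abundances" concrete, the following is a minimal Python sketch of one way relative abundances could be turned into per-taxon read counts for a fixed sequencing depth. It is an illustration only and is not taken from MeSS: the function name `reads_per_taxon`, the example taxa, and the largest-remainder rounding are all assumptions made here, and the actual tool may allocate reads differently.

```python
from typing import Dict


def reads_per_taxon(abundances: Dict[str, float], total_reads: int) -> Dict[str, int]:
    """Split total_reads across taxa in proportion to their abundances.

    Hypothetical helper for illustration; not part of the MeSS code base.
    Abundances may be fractions or percentages; they are normalized first.
    """
    if total_reads < 0:
        raise ValueError("total_reads must be non-negative")
    total = sum(abundances.values())
    if total <= 0 or any(a < 0 for a in abundances.values()):
        raise ValueError("abundances must be non-negative and sum to a positive value")

    # Exact (fractional) share of reads for each taxon.
    shares = {taxon: a / total * total_reads for taxon, a in abundances.items()}
    # Round down, then hand the leftover reads to the taxa with the
    # largest fractional parts so the counts sum exactly to total_reads.
    counts = {taxon: int(share) for taxon, share in shares.items()}
    leftover = total_reads - sum(counts.values())
    by_remainder = sorted(shares, key=lambda t: shares[t] - counts[t], reverse=True)
    for taxon in by_remainder[:leftover]:
        counts[taxon] += 1
    return counts


if __name__ == "__main__":
    # Example community (taxa and abundances chosen arbitrarily for the sketch).
    community = {
        "Escherichia coli": 50.0,
        "Staphylococcus aureus": 30.0,
        "Bacteroides fragilis": 20.0,
    }
    for taxon, n in reads_per_taxon(community, 1000).items():
        print(f"{taxon}\t{n}")
```

The rounding step matters for reproducibility: the same abundance table and read depth always yield the same counts, and the counts always sum to the requested depth.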