Tamock: simulation of habitat-specific benchmark data in metagenomics.

Affiliations

Division of Computational System Biology, Department of Microbiology and Ecosystem Science, University of Vienna, Vienna, Austria.

Department Bioengineering, University of Applied Sciences FH Campus Wien, Vienna, Austria.

Publication information

BMC Bioinformatics. 2021 May 1;22(1):227. doi: 10.1186/s12859-021-04154-z.

DOI:10.1186/s12859-021-04154-z
PMID:33932979
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC8088724/
Abstract

BACKGROUND

Simulated metagenomic reads are widely used to benchmark software and workflows for metagenome interpretation. The results of such benchmarks depend on assumptions about the underlying ecosystems, so conclusions from benchmark studies are limited to the ecosystems they mimic. Ideally, simulations are therefore based on genomes that realistically resemble particular metagenomic communities.

RESULTS

We developed Tamock to facilitate the realistic simulation of metagenomic reads according to a metagenomic community, based on real sequence data. Benchmark samples can be created from all genomes and taxonomic domains present in NCBI RefSeq. Tamock automatically determines taxonomic profiles from shotgun sequence data, selects reference genomes accordingly and uses them to simulate metagenomic reads. We present an example use case for Tamock by assessing assembly and binning method performance for selected microbiomes.
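The profile-then-simulate idea described above can be illustrated with a minimal sketch. It is not Tamock's implementation: the function name `allocate_reads`, the `min_fraction` cutoff, and the example taxa and counts are all hypothetical. The sketch assumes a taxonomic profile is available as classified read counts per taxon and splits a simulated read budget across the matching reference genomes in proportion to those counts.

```python
def allocate_reads(taxon_read_counts, total_reads, min_fraction=0.0):
    """Split total_reads across taxa in proportion to classified read counts.

    Hypothetical helper for illustration only; not part of Tamock.
    taxon_read_counts: dict mapping taxon name -> classified read count.
    min_fraction: taxa below this share of classified reads are dropped.
    """
    classified = sum(taxon_read_counts.values())
    if classified <= 0:
        raise ValueError("profile contains no classified reads")

    # Keep only taxa that are present and above the abundance cutoff.
    kept = {
        taxon: count
        for taxon, count in taxon_read_counts.items()
        if count > 0 and count / classified >= min_fraction
    }
    if not kept:
        raise ValueError("no taxon passes the abundance cutoff")
    kept_total = sum(kept.values())

    # Integer share per taxon, rounded down.
    allocation = {t: total_reads * c // kept_total for t, c in kept.items()}

    # Hand out the reads lost to rounding, largest remainder first,
    # so the allocation sums exactly to total_reads.
    leftover = total_reads - sum(allocation.values())
    by_remainder = sorted(
        kept, key=lambda t: (total_reads * kept[t]) % kept_total, reverse=True
    )
    for taxon in by_remainder[:leftover]:
        allocation[taxon] += 1
    return allocation


# Made-up profile for illustration; not data from the paper.
profile = {"Bacteroides": 600, "Escherichia": 300, "Methanobrevibacter": 100}
plan = allocate_reads(profile, total_reads=1000)
```

Each entry in `plan` would then be passed to a read simulator as the number of reads to draw from the reference genome chosen for that taxon. The largest-remainder rounding is a design choice of this sketch that keeps the simulated sample at exactly the requested size.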

CONCLUSIONS

Tamock facilitates automated simulation of habitat-specific benchmark metagenomic data based on real sequence data and is implemented as a user-friendly command-line application, providing extensive additional information along with the simulated benchmark data. Resulting benchmarks enable an assessment of computational methods, workflows, and parameters specifically for a metagenomic habitat or ecosystem of a metagenomic study.
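Because the source genome of every simulated read is known, a benchmark of this kind allows binning results to be scored against ground truth. The sketch below shows one common way to do this, using length-weighted purity and completeness per bin. These metrics, the function name `bin_purity_and_completeness`, and the example contigs are assumptions for illustration; they are not necessarily the measures or data used in the paper.

```python
from collections import defaultdict


def bin_purity_and_completeness(contigs):
    """Score bins against known source genomes.

    Hypothetical helper for illustration only.
    contigs: iterable of (bin_id, true_genome, length_bp) tuples;
             bin_id is None for contigs the binner left unassigned.
    Returns a dict: bin_id -> {"genome", "purity", "completeness"}.
    """
    bin_genome_bp = defaultdict(lambda: defaultdict(int))
    genome_bp = defaultdict(int)

    for bin_id, genome, length_bp in contigs:
        genome_bp[genome] += length_bp  # total assembled bp per genome
        if bin_id is not None:
            bin_genome_bp[bin_id][genome] += length_bp

    report = {}
    for bin_id, per_genome in bin_genome_bp.items():
        # The genome contributing the most base pairs defines the bin.
        dominant = max(per_genome, key=per_genome.get)
        bin_bp = sum(per_genome.values())
        report[bin_id] = {
            "genome": dominant,
            # Share of the bin that comes from its dominant genome.
            "purity": per_genome[dominant] / bin_bp,
            # Share of the dominant genome's contigs captured by the bin.
            "completeness": per_genome[dominant] / genome_bp[dominant],
        }
    return report


# Made-up contigs for illustration; not data from the paper.
example_contigs = [
    ("bin1", "A", 800),
    ("bin1", "B", 200),
    ("bin2", "B", 600),
    (None, "A", 200),
    (None, "B", 200),
]
scores = bin_purity_and_completeness(example_contigs)
```

Here completeness is measured relative to the assembled contigs of each genome rather than the full reference length, which is a simplification of this sketch.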

AVAILABILITY

Source code, documentation and install instructions are freely available at GitHub (https://github.com/gerners/tamock).
