In silico benchmarking of metagenomic tools for coding sequence detection reveals the limits of sensitivity and precision.

Author information

Infectious Diseases, Internal Medicine, Michigan Medicine, University of Michigan, Ann Arbor, MI, USA.

Microbiome Research Initiative, Fred Hutchinson Cancer Research Center, 1100 Fairview Ave N, E4-100, Seattle, WA, 98109-1024, USA.

Publication information

BMC Bioinformatics. 2020 Oct 15;21(1):459. doi: 10.1186/s12859-020-03802-0.

DOI: 10.1186/s12859-020-03802-0
PMID: 33059593
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC7559173/
Abstract

BACKGROUND

High-throughput sequencing can establish the functional capacity of a microbial community by cataloging the protein-coding sequences (CDS) present in the metagenome of the community. The relative performance of different computational methods for identifying CDS from whole-genome shotgun sequencing is not fully established.

RESULTS

Here we present an automated benchmarking workflow, using synthetic shotgun sequencing reads for which we know the true CDS content of the underlying communities, to determine the relative performance (sensitivity, positive predictive value or PPV, and computational efficiency) of different metagenome analysis tools for extracting the CDS content of a microbial community. Assembly-based methods are limited by coverage depth, with poor sensitivity for CDS at < 5X depth of sequencing, but have excellent PPV. Mapping-based techniques are more sensitive at low coverage depths, but can struggle with PPV. We additionally describe an expectation maximization based iterative algorithmic approach which we show to successfully improve the PPV of a mapping based technique while retaining improved sensitivity and computational efficiency.
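The two accuracy metrics named above, and the general idea of an expectation-maximization filter over read alignments, can be illustrated with a small sketch. This is not the authors' implementation: the function names (`sensitivity_ppv`, `em_filter`), the `min_reads` threshold, and the toy alignment data are all assumptions made for illustration. The sketch treats each read as a list of candidate reference CDS, repeatedly splits ambiguous reads in proportion to the current abundance estimates, and keeps only references that retain enough expected reads.

```python
from collections import defaultdict


def sensitivity_ppv(true_cds, detected_cds):
    """Return (sensitivity, PPV) for a set of detected CDS versus the truth."""
    true_cds, detected_cds = set(true_cds), set(detected_cds)
    tp = len(true_cds & detected_cds)  # true positives
    sensitivity = tp / len(true_cds) if true_cds else 0.0
    ppv = tp / len(detected_cds) if detected_cds else 0.0
    return sensitivity, ppv


def em_filter(alignments, n_iter=50, min_reads=0.5):
    """Toy EM filter (hypothetical, for illustration only).

    alignments: dict mapping read id -> list of candidate reference ids.
    Returns the references whose expected read count stays >= min_reads.
    """
    refs = {r for cands in alignments.values() for r in cands}
    n_reads = len(alignments)
    if not refs or n_reads == 0:
        return set()
    # Start from a uniform abundance estimate.
    abundance = {r: 1.0 / len(refs) for r in refs}
    for _ in range(n_iter):
        # E-step: split each read across its candidates by current abundance.
        expected = defaultdict(float)
        for cands in alignments.values():
            denom = sum(abundance[r] for r in cands)
            for r in cands:
                expected[r] += abundance[r] / denom
        # M-step: re-estimate abundance from the expected read counts.
        abundance = {r: expected[r] / n_reads for r in refs}
    return {r for r in refs if abundance[r] * n_reads >= min_reads}


# Toy data (assumed): A and B are truly present; C only attracts ambiguous reads.
true_cds = {"A", "B"}
alignments = {
    "read1": ["A"],
    "read2": ["A"],
    "read3": ["A", "C"],
    "read4": ["B"],
    "read5": ["B", "C"],
}

# Naive mapping: call every reference that receives any alignment.
naive_calls = {r for cands in alignments.values() for r in cands}
em_calls = em_filter(alignments)

print("naive:", sensitivity_ppv(true_cds, naive_calls))
print("EM-filtered:", sensitivity_ppv(true_cds, em_calls))
```

On this toy input the naive call set includes C as a false positive, which lowers PPV without changing sensitivity; the EM filter shifts the shared reads toward A and B, which also have uniquely aligned reads, so C falls below the threshold. Real data would need alignment scores and a more careful stopping rule.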

CONCLUSION

Our benchmarking approach reveals the trade-offs of assembly versus alignment-based approaches and the relative performance of specific implementations when one wishes to extract the protein coding capacity of microbial communities.

Figure 1: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/b038/7559173/0ef14c6a208f/12859_2020_3802_Fig1_HTML.jpg
Figure 2: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/b038/7559173/eff178ae618d/12859_2020_3802_Fig2_HTML.jpg
Figure 3: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/b038/7559173/5491062d001a/12859_2020_3802_Fig3_HTML.jpg
