MetaGen: reference-free learning with multiple metagenomic samples.

Affiliations

Department of Statistics, University of Georgia, Athens, 30602, GA, USA.

Department of Statistics, Harvard University, Cambridge, 02138, MA, USA.

Publication information

Genome Biol. 2017 Oct 3;18(1):187. doi: 10.1186/s13059-017-1323-y.

DOI: 10.1186/s13059-017-1323-y
PMID: 28974263
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC5627425/
Abstract

A major goal of metagenomics is to identify and study the entire collection of microbial species in a set of targeted samples. We describe a statistical metagenomic algorithm that simultaneously identifies microbial species and estimates their abundances without using reference genomes. As a trade-off, we require multiple metagenomic samples, usually ≥10 samples, to get highly accurate binning results. Compared to reference-free methods based primarily on k-mer distributions or coverage information, the proposed approach achieves a higher species binning accuracy and is particularly powerful when sequencing coverage is low. We demonstrated the performance of this new method through both simulation and real metagenomic studies. The MetaGen software is available at https://github.com/BioAlgs/MetaGen.
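
The intuition behind binning with multiple samples can be illustrated with a toy sketch. This is not the MetaGen algorithm or its software interface; it is a simplified stand-in (greedy grouping by profile distance) that assumes contigs from the same species show read counts that rise and fall together across samples. The function names, the `max_distance` threshold, and the counts are all hypothetical, and only four samples are used for brevity, whereas the abstract recommends ten or more.

```python
from typing import Dict, List, Sequence, Tuple


def profile(counts: Sequence[float]) -> List[float]:
    """Scale one contig's per-sample read counts so they sum to 1."""
    total = sum(counts)
    if total == 0:
        raise ValueError("contig has no mapped reads in any sample")
    return [c / total for c in counts]


def l1_distance(p: Sequence[float], q: Sequence[float]) -> float:
    """Sum of absolute differences between two across-sample profiles."""
    return sum(abs(a - b) for a, b in zip(p, q))


def bin_contigs(
    read_counts: Dict[str, Sequence[float]],
    max_distance: float = 0.2,  # hypothetical threshold, chosen for the toy data
) -> List[List[str]]:
    """Greedy grouping: a contig joins the first bin whose founding
    profile lies within max_distance, otherwise it founds a new bin."""
    bins: List[Tuple[List[float], List[str]]] = []
    for name, counts in read_counts.items():
        p = profile(counts)
        for founder_profile, members in bins:
            if l1_distance(p, founder_profile) <= max_distance:
                members.append(name)
                break
        else:
            bins.append((p, [name]))
    return [members for _, members in bins]


# Made-up read counts for four contigs across four samples.
toy_counts = {
    "contig_a": [100, 10, 50, 5],
    "contig_b": [210, 18, 95, 12],
    "contig_c": [5, 80, 4, 60],
    "contig_d": [9, 170, 11, 115],
}
bins = bin_contigs(toy_counts)
print(bins)
```

In this toy data, `contig_a` and `contig_b` are abundant in the first and third samples, while `contig_c` and `contig_d` are abundant in the second and fourth, so the two pairs land in separate bins. With more samples, profiles of different species become easier to tell apart, which is consistent with the trade-off the abstract describes.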
