What we can see from very small size sample of metagenomic sequences.

Author affiliations

Graduate Program in Technology Policy, Yonsei University, 50 Yonsei Ro, Seodaemun Gu, Seoul, 038722, South Korea.

School of Civil and Environmental Engineering, Yonsei University, 50 Yonsei Ro, Seodaemun Gu, Seoul, 038722, South Korea.

Publication information

BMC Bioinformatics. 2018 Nov 3;19(1):399. doi: 10.1186/s12859-018-2431-8.

DOI:10.1186/s12859-018-2431-8

PMID:30390617

Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC6215618/

Abstract

BACKGROUND

Because analyzing a large number of metagenomic sequences requires substantial computing resources and a long time, we examined small selected subsets of metagenomic sequences as "samples" of the full sequence sets, both for a mock community and for 10 existing metagenomics case studies. A mock community of 10 bacterial strains was prepared, and the mixed genomes were sequenced with HiSeq. BLAST hits against the reference genome of each strain were counted. Each of 176 different small subsets selected from these sequences was also searched with BLAST and its hits counted, so that the results could be compared with the original search results from the full sequences. We also prepared small subsets of sequences selected from 10 publicly downloadable research datasets on the MG-RAST service and analyzed these samples with MG-RAST.
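The subsampling step described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the abstract does not say how the small parts were selected, so simple random selection is assumed here, and the function names (`read_fasta`, `subsample`, `hit_proportions`) and the `best_hit` mapping from read ID to reference genome are hypothetical stand-ins for the output of a separate BLAST run.

```python
import random
from collections import Counter


def read_fasta(lines):
    """Yield (header, sequence) pairs from an iterable of FASTA lines."""
    header, chunks = None, []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        else:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def subsample(records, fraction, seed=0):
    """Return a simple random subset of the records.

    Assumption: random selection without replacement; the source does not
    state the selection scheme that was actually used.
    """
    records = list(records)
    k = max(1, round(len(records) * fraction))
    return random.Random(seed).sample(records, k)


def hit_proportions(read_ids, best_hit):
    """Share of hits per reference genome among the reads that have a hit.

    `best_hit` is a hypothetical dict {read_id: reference_name} built from
    BLAST results obtained elsewhere.
    """
    counts = Counter(best_hit[r] for r in read_ids if r in best_hit)
    total = sum(counts.values())
    return {ref: n / total for ref, n in counts.items()} if total else {}
```

Computing `hit_proportions` once for all read IDs and once for the IDs returned by `subsample` gives two profiles that can be compared strain by strain, which mirrors the comparison between the small parts and the full sequences.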

RESULTS

Both the BLAST search tests on the mock community and the results from the publicly downloadable MG-RAST studies show that sampling an extremely small part of the sequence data is useful for estimating brief taxonomic information about the original metagenomic sequences. In 9 of the 10 cases, the most frequently annotated class in the MG-RAST analysis of the selected partial sample sequences was the same as in the original.
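The "most annotated class" comparison can be expressed as a small check. The sketch below is an assumption about how such a tally might be done, not the authors' procedure: each case is represented by a hypothetical dict of annotation counts per taxonomic class, one from the full data and one from the subsample.

```python
def most_annotated(class_counts):
    """Return the class with the largest annotation count (ties: first seen)."""
    return max(class_counts, key=class_counts.get)


def agreement(full_profiles, sample_profiles):
    """Count the cases whose subsample and full data share the same top class.

    Both arguments are hypothetical dicts {case_id: {class_name: count}}.
    Returns (number_of_matching_cases, number_of_cases).
    """
    matches = sum(
        most_annotated(full_profiles[case]) == most_annotated(sample_profiles[case])
        for case in full_profiles
    )
    return matches, len(full_profiles)
```

A result of `(9, 10)` from `agreement` would correspond to the kind of outcome reported above; comparing only the top class is a coarse measure and says nothing about how well the rest of the distribution is preserved.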

CONCLUSIONS

When a researcher wants to estimate brief information about a metagenome's taxonomic distribution with fewer computing resources and in less time, the researcher can analyze a small selected part of the metagenomic sequences. This approach also makes it possible to build a strategy for monitoring metagenome samples over a wider geographic area and more frequently.
