Comparison of different assembly and annotation tools on analysis of simulated viral metagenomic communities in the gut.

Affiliation

Fundación para el Fomento de la Investigación Sanitaria y Biomédica de la Comunidad Valencia (FISABIO)-Salud Pública, Avenida de Cataluña 21, 46020 Valencia, Spain.

Publication information

BMC Genomics. 2014 Jan 18;15:37. doi: 10.1186/1471-2164-15-37.

DOI: 10.1186/1471-2164-15-37

PMID: 24438450

Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC3901335/

Abstract

BACKGROUND

The main limitations in the analysis of viral metagenomes are perhaps the high genetic variability and the lack of information in extant databases. To address these issues, several bioinformatic tools have been specifically designed or adapted for metagenomics by improving read assembly and creating more sensitive methods for homology detection. This study compares the performance of different available assemblers and taxonomic annotation software using simulated viral-metagenomic data.

RESULTS

We simulated two 454 viral metagenomes using genomes from NCBI's RefSeq database, selected from the list of viruses actually found in previously published metagenomes. Three assembly strategies, spanning six assemblers, were tested: the overlap-layout-consensus algorithms Newbler, Celera and Minimo; the de Bruijn graph algorithms Velvet and MetaVelvet; and the probabilistic read model Genovo. Assembly performance was measured by the length of the resulting contigs (using N50), the percentage of reads assembled and the overall accuracy relative to the corresponding reference genomes. Additionally, the number of chimeras per contig and the lowest common ancestor were estimated to assess the effect of assembly on taxonomic and functional annotation. Functional classification of the reads was evaluated by counting the reads that correctly matched the functional data previously reported for the original genomes and by calculating the number of over-represented functional categories in chimeric contigs. The sensitivity and specificity of tBLASTx, PhymmBL and k-mer frequencies were measured from the accuracy of their predictions when simulated reads were compared against the NCBI Virus genomes RefSeq database.
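The assembly and annotation metrics named above can be illustrated with a minimal Python sketch. It uses the standard textbook definitions of N50, percentage of reads assembled, sensitivity and specificity; the abstract does not state the exact formulas the authors applied, so the function names, arguments and example numbers below are assumptions made for illustration only.

```python
def n50(contig_lengths):
    """Return the N50 of an assembly: the contig length L such that
    contigs of length >= L together hold at least half of all bases."""
    if not contig_lengths:
        raise ValueError("at least one contig length is required")
    total = sum(contig_lengths)
    running = 0
    # Walk from the longest contig down until half the bases are covered.
    for length in sorted(contig_lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length


def percent_reads_assembled(assembled_reads, total_reads):
    """Percentage of input reads that were placed into contigs."""
    return 100.0 * assembled_reads / total_reads


def sensitivity(true_pos, false_neg):
    """Fraction of reads with a known origin that were assigned correctly."""
    return true_pos / (true_pos + false_neg)


def specificity(true_neg, false_pos):
    """Fraction of reads that should not be assigned and were not."""
    return true_neg / (true_neg + false_pos)


if __name__ == "__main__":
    # Hypothetical contig lengths and counts, not data from the study.
    print(n50([2, 3, 4, 5, 6]))
    print(percent_reads_assembled(750, 1000))
    print(sensitivity(8, 2), specificity(9, 1))
```

In a simulated metagenome the true origin of every read is known, which is what makes counts such as true and false positives available in the first place.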

CONCLUSIONS

Assembly improves functional annotation by increasing accurate assignments and decreasing ambiguous hits between viruses and bacteria. However, its success is limited by chimeric contigs, which occur at all taxonomic levels. The assembler and its parameters should be selected based on the focus of each study. Minimo's non-chimeric contigs and Genovo's long contigs excelled in taxonomic assignment and functional annotation, respectively. tBLASTx stood out as the best approach to taxonomic annotation for virus identification. PhymmBL proved useful in datasets in which no related sequences are present, as it uses genomic features that may help identify distant taxa. The k-mer frequencies approach underperformed in all viral datasets.
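One way to picture a contig being chimeric "at a taxonomic level" is to take the lineages of the genomes its reads came from and find their lowest common ancestor (LCA): the deeper the LCA, the more closely related the mixed sources. The sketch below is a generic LCA over root-to-leaf lineage lists, not the authors' procedure; the function name and the placeholder taxon labels are hypothetical.

```python
def lowest_common_ancestor(lineages):
    """Return the deepest taxon shared by all lineages, or None.

    Each lineage is a list of taxon names ordered from the root
    down to the most specific rank.
    """
    if not lineages:
        return None
    lca = None
    # Compare the lineages rank by rank, starting at the root.
    for taxa_at_rank in zip(*lineages):
        if all(taxon == taxa_at_rank[0] for taxon in taxa_at_rank):
            lca = taxa_at_rank[0]
        else:
            break  # The lineages diverge at this rank.
    return lca


if __name__ == "__main__":
    # Placeholder lineages for two reads assembled into one contig.
    read_sources = [
        ["root", "order_A", "family_X", "virus_1"],
        ["root", "order_A", "family_Y", "virus_2"],
    ]
    print(lowest_common_ancestor(read_sources))
```

With these placeholder lineages the two sources share an order but not a family, so the contig would count as chimeric at the family level.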
