Virseqimprover: an integrated pipeline for viral contig error correction, extension, and annotation.

Authors

Song Haoqiu, Tithi Saima Sultana, Brown Connor, Aylward Frank O, Jensen Roderick, Zhang Liqing

Affiliations

Department of Computer Science, Virginia Polytechnic Institute and State University (Virginia Tech), Blacksburg, VA, United States of America.

Department of Cell & Molecular Biology, St. Jude Children's Research Hospital, Memphis, TN, United States of America.

Publication information

PeerJ. 2025 Jan 10;13:e18515. doi: 10.7717/peerj.18515. eCollection 2025.

DOI: 10.7717/peerj.18515
PMID: 39807156
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC11727651/

Abstract

Despite the recent surge of viral metagenomic studies, it remains a significant challenge to recover complete virus genomes from metagenomic data. The majority of viral contigs generated from de novo assembly programs are highly fragmented, presenting significant challenges to downstream analysis and inference. To address this issue, we have developed Virseqimprover, a computational pipeline that can extend assembled contigs to complete or nearly complete genomes while maintaining extension quality. Virseqimprover first examines whether there is any chimeric sequence based on read coverage, breaks the sequence into segments if there is, then extends the longest segment with uniform depth of coverage, and repeats these procedures until the sequence cannot be extended. Finally, Virseqimprover annotates the gene content of the resulting sequence. Results show that Virseqimprover has good performances on correcting and extending viral contigs to their full lengths, hence can be a useful tool to improve the completeness and minimize the assembly errors of viral contigs. Both a web server and a conda package for Virseqimprover are provided to the research community free of charge.
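
The loop described in the abstract (check coverage for chimeric joins, split, extend the longest uniformly covered segment, repeat) can be illustrated with a minimal Python sketch. This is not the Virseqimprover implementation. The function names, the low-coverage cutoff `low_frac`, the coefficient-of-variation limit `max_cv`, and the `compute_depth` and `extend` callbacks are all hypothetical stand-ins chosen for illustration; the abstract does not specify how chimeras are detected or how uniformity is measured.

```python
from statistics import mean, median, pstdev


def split_on_low_coverage(depth, low_frac=0.2):
    """Split a contig into half-open (start, end) segments wherever per-base
    depth drops below low_frac * median depth (an assumed chimera signal)."""
    if not depth:
        return []
    cutoff = low_frac * median(depth)
    segments, start = [], None
    for i, d in enumerate(depth):
        if d >= cutoff:
            if start is None:
                start = i              # open a new segment
        elif start is not None:
            segments.append((start, i))  # close the segment at the dip
            start = None
    if start is not None:
        segments.append((start, len(depth)))
    return segments


def is_uniform(depth, max_cv=0.5):
    """Assumed uniformity test: coefficient of variation at most max_cv."""
    m = mean(depth)
    return m > 0 and pstdev(depth) / m <= max_cv


def longest_uniform_segment(depth, low_frac=0.2, max_cv=0.5):
    """Return the longest segment that passes the uniformity test, or None."""
    candidates = [
        (s, e)
        for s, e in split_on_low_coverage(depth, low_frac)
        if is_uniform(depth[s:e], max_cv)
    ]
    return max(candidates, key=lambda se: se[1] - se[0], default=None)


def improve(contig, compute_depth, extend, max_rounds=50):
    """Repeat check -> split -> extend until the sequence stops growing.

    compute_depth(seq) -> list of per-base depths (hypothetical callback,
    e.g. wrapping a read mapper); extend(seq) -> possibly longer sequence
    (hypothetical callback, e.g. wrapping a local assembler).
    """
    for _ in range(max_rounds):
        seg = longest_uniform_segment(compute_depth(contig))
        if seg is None:
            break
        trimmed = contig[seg[0]:seg[1]]
        extended = extend(trimmed)
        if len(extended) <= len(trimmed):
            return trimmed             # no further extension possible
        contig = extended
    return contig
```

In practice the two callbacks would wrap read mapping and assembly tools. The sketch only shows the control flow, so the thresholds should be treated as placeholders rather than recommended values.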
