CheckV assesses the quality and completeness of metagenome-assembled viral genomes.

Affiliations

US Department of Energy Joint Genome Institute, Lawrence Berkeley National Laboratory, Berkeley, CA, USA.

Department of Genetics, Evolution, Microbiology and Immunology, Institute of Biology, University of Campinas, Campinas, Brazil.

Publication information

Nat Biotechnol. 2021 May;39(5):578-585. doi: 10.1038/s41587-020-00774-7. Epub 2020 Dec 21.

DOI:10.1038/s41587-020-00774-7
PMID:33349699
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC8116208/
Abstract

Millions of new viral sequences have been identified from metagenomes, but the quality and completeness of these sequences vary considerably. Here we present CheckV, an automated pipeline for identifying closed viral genomes, estimating the completeness of genome fragments and removing flanking host regions from integrated proviruses. CheckV estimates completeness by comparing sequences with a large database of complete viral genomes, including 76,262 identified from a systematic search of publicly available metagenomes, metatranscriptomes and metaviromes. After validation on mock datasets and comparison to existing methods, we applied CheckV to large and diverse collections of metagenome-assembled viral sequences, including IMG/VR and the Global Ocean Virome. This revealed 44,652 high-quality viral genomes (that is, >90% complete), although the vast majority of sequences were small fragments, which highlights the challenge of assembling viral genomes from short-read metagenomes. Additionally, we found that removal of host contamination substantially improved the accurate identification of auxiliary metabolic genes and interpretation of viral-encoded functions.
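
The abstract's definition of a high-quality genome (more than 90% complete) can be illustrated with a minimal sketch. It is not CheckV's implementation. It assumes, for illustration only, that completeness is approximated as contig length divided by the length of a matching complete reference genome, capped at 100%. The `Contig` class, its `expected_length` field, and the function names are hypothetical.

```python
from dataclasses import dataclass

# The abstract defines high-quality genomes as >90% complete.
HIGH_QUALITY_THRESHOLD = 90.0  # percent


@dataclass
class Contig:
    name: str
    length: int           # contig length in bp
    expected_length: int  # hypothetical: length of a matching complete reference genome


def estimate_completeness(contig: Contig) -> float:
    """Simplified assumption: contig length relative to the expected genome
    length, expressed as a percentage and capped at 100."""
    if contig.expected_length <= 0:
        raise ValueError("expected_length must be positive")
    return min(100.0, 100.0 * contig.length / contig.expected_length)


def is_high_quality(contig: Contig) -> bool:
    """True when the estimated completeness exceeds the 90% threshold."""
    return estimate_completeness(contig) > HIGH_QUALITY_THRESHOLD


# Made-up example values, not data from the paper.
contigs = [
    Contig("contig_a", 38_000, 40_000),
    Contig("contig_b", 5_000, 40_000),
    Contig("contig_c", 45_000, 40_000),
]

high_quality = [c.name for c in contigs if is_high_quality(c)]
print(high_quality)
```

Here the short fragment `contig_b` falls well below the threshold. This mirrors the abstract's observation that most metagenome-assembled viral sequences are small fragments.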

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/386b/8116208/c73c3dbc98f5/41587_2020_774_Fig1_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/386b/8116208/43de92705811/41587_2020_774_Fig2_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/386b/8116208/6ba370527032/41587_2020_774_Fig3_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/386b/8116208/9444d79a717b/41587_2020_774_Fig4_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/386b/8116208/db4aa5489d61/41587_2020_774_Fig5_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/386b/8116208/500f13cf442c/41587_2020_774_Fig6_ESM.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/386b/8116208/228d3f82ac2a/41587_2020_774_Fig7_ESM.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/386b/8116208/5ca58bf12ea3/41587_2020_774_Fig8_ESM.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/386b/8116208/acb99330ea1c/41587_2020_774_Fig9_ESM.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/386b/8116208/42d509fea590/41587_2020_774_Fig10_ESM.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/386b/8116208/5e8ae300a3f5/41587_2020_774_Fig11_ESM.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/386b/8116208/3a337a2550dc/41587_2020_774_Fig12_ESM.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/386b/8116208/db90c595d6fb/41587_2020_774_Fig13_ESM.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/386b/8116208/45c2c19f601e/41587_2020_774_Fig14_ESM.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/386b/8116208/227271041956/41587_2020_774_Fig15_ESM.jpg
