Hecatomb: an integrated software platform for viral metagenomics.

Author information

Flinders Accelerator for Microbiome Exploration, Flinders University, Adelaide, SA, Australia.

Adelaide Centre for Epigenetics, University of Adelaide, Adelaide, SA, 5005, Australia.

Publication information

Gigascience. 2024 Jan 2;13. doi: 10.1093/gigascience/giae020.

Abstract

BACKGROUND

Modern sequencing technologies offer extraordinary opportunities for virus discovery and virome analysis. Annotation of viral sequences from metagenomic data requires a complex series of steps to ensure accurate annotation of individual reads and assembled contigs. In addition, varying study designs will require project-specific statistical analyses.

FINDINGS

Here we introduce Hecatomb, a bioinformatic platform coordinating commonly used tasks required for virome analysis. Hecatomb means "a great sacrifice." In this setting, Hecatomb is "sacrificing" false-positive viral annotations using extensive quality control and tiered-database searches. Hecatomb processes metagenomic data obtained from both short- and long-read sequencing technologies, providing annotations to individual sequences and assembled contigs. Results are provided in commonly used data formats useful for downstream analysis. Here we demonstrate the functionality of Hecatomb through the reanalysis of a primate enteric virome and the analysis of a novel coral reef virome.
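
The idea of "sacrificing" false positives through tiered-database searches can be illustrated with a toy sketch. This is not Hecatomb's implementation: the function names, the dictionary-based "databases" (stand-ins for precomputed alignment hits), the taxon strings, and the scores are all hypothetical and chosen only to show the filtering logic. A sequence is kept as viral only if its hit in a virus-only database is not outscored by a non-viral hit in a broader database.

```python
# Toy illustration of a two-tier database search.
# All names, taxa, and scores here are hypothetical; this is not Hecatomb code.

def best_hit(sequence_id, database):
    """Return the highest-scoring (taxon, score) hit for a sequence, or None."""
    hits = database.get(sequence_id, [])
    return max(hits, key=lambda hit: hit[1]) if hits else None


def tiered_viral_calls(sequence_ids, viral_db, broad_db):
    """Keep sequences whose viral hit survives a check against a broader database."""
    confirmed = {}
    for seq_id in sequence_ids:
        # Tier 1: look for a candidate hit in the virus-only database.
        primary = best_hit(seq_id, viral_db)
        if primary is None:
            continue  # no viral candidate at all
        # Tier 2: re-check the candidate against a database covering all kingdoms.
        secondary = best_hit(seq_id, broad_db)
        outscored_by_non_viral = (
            secondary is not None
            and not secondary[0].startswith("Viruses")
            and secondary[1] > primary[1]
        )
        if outscored_by_non_viral:
            continue  # likely false positive: a non-viral taxon explains it better
        confirmed[seq_id] = primary[0]
    return confirmed


# Hypothetical precomputed hits: sequence id -> list of (taxon, score).
viral_db = {
    "read1": [("Viruses;Microviridae", 120.0)],
    "read2": [("Viruses;Siphoviridae", 80.0)],
}
broad_db = {
    "read1": [("Viruses;Microviridae", 120.0), ("Bacteria;Escherichia", 60.0)],
    "read2": [("Bacteria;Bacteroides", 150.0)],
}

calls = tiered_viral_calls(["read1", "read2", "read3"], viral_db, broad_db)
print(calls)
```

In this sketch, `read2` is dropped because a bacterial hit outscores its viral hit in the broader database, and `read3` is dropped because it has no viral candidate.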

CONCLUSION

Hecatomb provides an integrated platform to manage many commonly used steps for virome characterization, including rigorous quality control, host removal, and both read- and contig-based analysis. Each step is managed using the Snakemake workflow manager with dependency management using Conda. Hecatomb outputs several tables properly formatted for immediate use within popular data analysis and visualization tools, enabling effective data interpretation for a variety of study designs. Hecatomb is hosted on GitHub (github.com/shandley/hecatomb) and is available for installation from Bioconda and PyPI.


https://cdn.ncbi.nlm.nih.gov/pmc/blobs/39eb/11148595/079fee5fdca2/giae020fig1.jpg
