Instituto de Microbiologia, Instituto de Medicina Molecular, Faculdade de Medicina, Universidade de Lisboa, Lisboa, Portugal.
University of Groningen, University Medical Center Groningen, Department of Medical Microbiology and Infection Prevention, Groningen, The Netherlands.
Microb Genom. 2020 Mar;6(3). doi: 10.1099/mgen.0.000328.
Dengue virus (DENV) represents a public health threat and economic burden in affected countries. The availability of genomic data is key to understanding viral evolution and dynamics, supporting improved control strategies. Currently, the use of high-throughput sequencing (HTS) technologies, which can be applied both directly to patient samples (shotgun metagenomics) and to PCR-amplified viral sequences (amplicon sequencing), is potentially the most informative approach to monitor viral dissemination and genetic diversity by providing, in a single methodological step, identification and characterization of the whole viral genome at the nucleotide level. Despite many advantages, these technologies require bioinformatics expertise and appropriate infrastructure for the analysis and interpretation of the resulting data. In addition, the many software solutions available can hamper the reproducibility and comparison of results. Here we present DEN-IM, a one-stop, user-friendly, containerized and reproducible workflow for the analysis of DENV short-read sequencing data from both amplicon and shotgun metagenomics approaches. It is able to infer the DENV coding sequence (CDS), identify the serotype and genotype, and generate a phylogenetic tree. It can easily be run on any UNIX-like system, from local machines to high-performance computing clusters, performing a comprehensive analysis without the requirement for extensive bioinformatics expertise. Using DEN-IM, we successfully analysed two types of DENV datasets. The first comprised 25 shotgun metagenomic sequencing samples from patients with variable serotypes and genotypes, including a spiked sample containing the four known serotypes. The second consisted of 106 paired-end and 76 single-end amplicon sequences of DENV 3 genotype III and DENV 1 genotype I, respectively, where DEN-IM allowed detection of the intra-genotype diversity.
The DEN-IM workflow, parameters and execution configuration files, and documentation are freely available at https://github.com/B-UMMI/DEN-IM.