Araújo Pedro M M, Martins Joana S, Osório Nuno S
Life and Health Sciences Research Institute (ICVS), School of Medicine, University of Minho, Braga, Portugal.
ICVS/3B's - PT Government Associate Laboratory, Braga, Guimarães, Portugal.
Virus Evol. 2019 Nov 19;5(2):vez050. doi: 10.1093/ve/vez050. eCollection 2019 Jul.
Human immunodeficiency virus 1 (HIV-1) genome sequencing is routinely performed for drug resistance monitoring in hospitals worldwide. Subtyping these extensive datasets of HIV-1 sequences is a critical first step in molecular epidemiology and evolution studies. The clinical relevance of HIV-1 subtypes is increasingly recognized: several studies suggest subtype-related differences in disease progression, transmission route efficiency, immune evasion, and even therapeutic outcomes. HIV-1 subtyping is mainly done using web servers. These tools have limited scalability and may not comply with data protection legislation. Thus, the aim of this work was to develop an efficient method for large-scale local HIV-1 subtyping. We designed SNAPPy: a Snakemake pipeline for scalable HIV-1 subtyping by phylogenetic pairing. It contains several phylogenetic inference and BLAST query tasks, which can be executed sequentially or in parallel, taking advantage of multi-core processing units. Although it was built for subtyping, SNAPPy is also useful for performing extensive HIV-1 alignments. This tool facilitates large-scale sequence-based HIV-1 research by providing a local, resource-efficient, and scalable alternative for HIV-1 subtyping. It is capable of analyzing full-length genomes or partial HIV-1 genomic regions (GAG, POL, and ENV) and recognizes more than ninety circulating recombinant forms. SNAPPy is freely available at: https://github.com/PMMAraujo/snappy/releases.