Atxaerandio-Landa Aitor, Arrieta-Gisasola Ainhoa, Laorden Lorena, Bikandi Joseba, Garaizar Javier, Martinez-Malaxetxebarria Irati, Martinez-Ballesteros Ilargi
MikroIker Research Group, Department of Immunology, Microbiology, and Parasitology, Faculty of Pharmacy, University of the Basque Country UPV/EHU, 01006 Vitoria-Gasteiz, Spain.
Bioaraba, Microbiology, Infectious Disease, Antimicrobial Agents and Gene Therapy Group, 01009 Vitoria-Gasteiz, Spain.
Microorganisms. 2022 Nov 29;10(12):2364. doi: 10.3390/microorganisms10122364.
The use of whole-genome sequencing (WGS) for bacterial characterisation has increased substantially in the last decade. Its high throughput and decreasing cost have led to significant changes in outbreak investigations and surveillance of a wide variety of microbial pathogens. Despite the many advantages of WGS, several drawbacks concerning data analysis and management, as well as a general lack of standardisation, hinder its integration into routine use. In this work, a bioinformatics workflow for Illumina WGS data is presented for bacterial characterisation, including genome annotation, species identification, serotype prediction, antimicrobial resistance prediction, detection of virulence-related genes and plasmid replicons, core-genome-based or single nucleotide polymorphism (SNP)-based phylogenetic clustering, and sequence typing. The workflow was tested using a collection of 22 in-house sequences of isolates belonging to a local outbreak, together with a collection of 182 publicly available genomes. No errors were reported during execution, and all genomes were analysed. The bioinformatics workflow can be tailored to other pathogens of interest and is freely available for academic and non-profit use as a file that can be uploaded to the Galaxy platform.