Institute of Microbiology and Immunology, Faculty of Medicine, University of Ljubljana, Zaloška cesta 4, 1000 Ljubljana, Slovenia.
Brief Bioinform. 2024 Sep 23;25(6). doi: 10.1093/bib/bbae597.
Over the past decade, metagenomics has improved considerably through new sequencing technologies, advances in bioinformatics and the development of reference databases, but a one-size-fits-all sequencing and bioinformatics pipeline does not yet seem achievable. In this study, we address the bioinformatics part of the analysis by combining three methods into a three-step workflow that increases the sensitivity and specificity of clinical metagenomics and improves pathogen detection. The individual tools are combined into MetaAll, a user-friendly workflow suitable for analysing short paired-end (PE) and long reads from metagenomic datasets. To demonstrate the applicability of the workflow, we used four complicated clinical cases with different disease presentations and multiple samples collected from different biological sites, as well as the CAMI Clinical pathogen detection challenge dataset. MetaAll identified putative pathogens in all but one case, and in that case traditional microbiological diagnostics were also unsuccessful. In addition, co-infection with Haemophilus influenzae and Human rhinovirus C54 was detected in case 1, and co-infection with SARS-CoV-2 and Influenza A virus (FluA) subtype H3N2 was detected in case 3. In case 2, in which conventional diagnostics could not find a pathogen, metagenomic next-generation sequencing (mNGS) pointed to Klebsiella pneumoniae as the suspected pathogen. Finally, this study demonstrated the importance of combining read classification, contig validation and targeted reference mapping for more reliable detection of infectious agents in clinical metagenome samples.
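The abstract names three lines of evidence (read classification, contig validation and targeted reference mapping) without specifying how they are combined. The Python sketch below is only an illustration of the general idea: a taxon is kept as a candidate when several independent steps support it. The `Evidence` class, the function names, the thresholds and the example values are all hypothetical and are not taken from MetaAll.

```python
from dataclasses import dataclass


@dataclass
class Evidence:
    """Per-taxon evidence from three analysis steps (hypothetical structure)."""
    taxon: str
    classified_reads: int    # step 1: reads assigned to the taxon by a read classifier
    validated_contigs: int   # step 2: assembled contigs confirmed as this taxon
    genome_coverage: float   # step 3: fraction (0-1) of a reference covered by mapped reads


def supporting_steps(e, min_reads=10, min_contigs=1, min_coverage=0.1):
    """Count how many of the three steps support the taxon.

    The thresholds are illustrative assumptions, not published cut-offs.
    """
    return sum([
        e.classified_reads >= min_reads,
        e.validated_contigs >= min_contigs,
        e.genome_coverage >= min_coverage,
    ])


def rank_candidates(evidence, min_steps=2):
    """Keep taxa supported by at least `min_steps` steps, best supported first."""
    kept = [e for e in evidence if supporting_steps(e) >= min_steps]
    return sorted(
        kept,
        key=lambda e: (supporting_steps(e), e.genome_coverage),
        reverse=True,
    )


# Made-up values for illustration only; these are not results from the study.
example = [
    Evidence("taxon_A", classified_reads=500, validated_contigs=3, genome_coverage=0.85),
    Evidence("taxon_B", classified_reads=40, validated_contigs=0, genome_coverage=0.02),
    Evidence("taxon_C", classified_reads=12, validated_contigs=1, genome_coverage=0.05),
]

for candidate in rank_candidates(example):
    print(candidate.taxon, supporting_steps(candidate))
```

Requiring agreement between steps trades some sensitivity for specificity: a taxon seen only in read classification (such as `taxon_B` here) is dropped, which is the kind of false positive that contig validation and reference mapping are meant to filter out.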