Addressing data management and analysis challenges in viral genomics: The Swiss HIV cohort study viral next generation sequencing database.

Authors

Zeeb Marius, Frischknecht Paul, Balakrishna Suraj, Jörimann Lisa, Tschumi Jasmin, Zsichla Levente, Chaudron Sandra E, Jaha Bashkim, Neumann Kathrin, Leemann Christine, Huber Michael, Leuzinger Karoline, Günthard Huldrych F, Metzner Karin J, Kouyos Roger D

Affiliations

Department of Infectious Diseases and Hospital Epidemiology, University Hospital Zurich, Zurich, Switzerland.

Institute of Medical Virology, University of Zurich, Zurich, Switzerland.

Publication information

PLOS Digit Health. 2025 Apr 21;4(4):e0000825. doi: 10.1371/journal.pdig.0000825. eCollection 2025 Apr.

Abstract

Numerous HIV-related outcomes can be determined from the viral genome, for example resistance-associated mutations, population transmission dynamics, viral heritability traits, or time since infection. Viral sequences of people with HIV (PWH) are therefore essential for therapeutic and research purposes. During the first three decades of the HIV pandemic, viral genomes were mainly sequenced using Sanger sequencing, whereas the last decade has seen a shift towards next-generation sequencing (NGS) as the preferred method. NGS can achieve near-full-length genome sequence coverage and, at the same time, accurately captures within-host diversity by characterizing HIV subpopulations. NGS opens new avenues for HIV research, but it also presents challenges concerning data management and analysis. We therefore set up the Swiss HIV Cohort Study Viral NGS Database (SHCND) to address key issues in the handling of NGS data, including high loads of raw and processed NGS data, data storage solutions, downstream application of sophisticated bioinformatic tools, high-performance computing resources, and reproducibility. The database is nested within the Swiss HIV Cohort Study (SHCS) and the Zurich Primary HIV Infection Cohort Study (ZPHI), which together have enrolled 21,876 PWH since 1988 and include a biobank dating back to the early nineties. Since its initiation in 2018, the SHCND has accumulated NGS sequences (of plasma and proviral origin) from 5,178 unique PWH. Here we describe the design, set-up, and use of this NGS database. Overall, the SHCND has contributed to several research projects on HIV pathogenesis, treatment, drug resistance, and molecular epidemiology, and has thereby become a central part of HIV genomics research in Switzerland.

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/66e1/12011223/487b25bc7e19/pdig.0000825.g001.jpg
