Brittain John-Stuart, Tsui Joseph, Inward Rhys, Gutierrez Bernardo, Mwanyika Gaspary, Tegally Houriiyah, Huynh Tuyen, Githinji George, Tessema Sofonias Kifle, McCrone John T, Bhatt Samir, Dasgupta Abhishek, Ratcliffe Stephen, Kraemer Moritz U G
Oxford Research Software Engineering Group, University of Oxford, Oxford, England, UK.
Pandemic Sciences Institute, University of Oxford, Oxford, England, UK.
Wellcome Open Res. 2025 May 27;10:279. doi: 10.12688/wellcomeopenres.23824.1. eCollection 2025.
The increase in volume and diversity of relevant data on infectious diseases and their drivers provides opportunities to generate new scientific insights that can support 'real-time' decision-making in public health across outbreak contexts and enhance pandemic preparedness. However, utilising the wide array of clinical, genomic, epidemiological, and spatial data collected globally is difficult due to differences in data preprocessing, data science capacity, and access to hardware and cloud resources. To facilitate large-scale and routine analyses of infectious disease data at the local level (i.e. without sharing data across borders), we developed GRAPEVNE (Graphical Analytical Pipeline Development Environment), a platform that enables the construction of modular pipelines for complex and repetitive data analysis workflows through an intuitive graphical interface. Built on the Snakemake workflow management system, GRAPEVNE streamlines the creation, execution, and sharing of analytical pipelines. Its modular approach already supports a diverse range of scientific applications, including genomic analysis, epidemiological modelling, and large-scale data processing. Each module in GRAPEVNE is a self-contained Snakemake workflow, complete with configurations, scripts, and metadata, enabling interoperability. The platform's open-source nature supports ongoing community-driven development and scalability. GRAPEVNE empowers researchers and public health institutions by simplifying complex analytical workflows, fostering data-driven discovery, and enhancing reproducibility in computational research. Its user-driven ecosystem encourages continuous innovation in biomedical and epidemiological research and is also applicable beyond these fields. Key use-cases include automated phylogenetic analysis of viral sequences, real-time outbreak monitoring, forecasting, and epidemiological data processing. For instance, our dengue virus pipeline demonstrates end-to-end automation from sequence retrieval to phylogeographic inference; it leverages established bioinformatics tools and can be deployed in any geographical context. For more details, see the documentation at: https://grapevne.readthedocs.io.
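The idea of chaining self-contained modules into a pipeline can be illustrated with a minimal Python sketch. This is not GRAPEVNE's or Snakemake's API: the `Module` class, the `execution_order` function, and the module names are hypothetical, chosen only to mirror the dengue example (sequence retrieval through to phylogeographic inference). The sketch assumes each module declares which upstream modules feed it, and derives a valid run order by topological sorting.

```python
from dataclasses import dataclass, field
from graphlib import TopologicalSorter  # standard library, Python 3.9+


@dataclass
class Module:
    """Hypothetical stand-in for a self-contained workflow module."""
    name: str
    config: dict = field(default_factory=dict)  # module-level settings
    inputs: list = field(default_factory=list)  # names of upstream modules


def execution_order(modules):
    """Return module names in an order that respects declared inputs."""
    # Map each module to the set of modules that must run before it.
    graph = {m.name: set(m.inputs) for m in modules}
    return list(TopologicalSorter(graph).static_order())


# Modules are listed out of order on purpose; the sort recovers the chain.
pipeline = [
    Module("phylogeography", inputs=["tree"]),
    Module("fetch_sequences", config={"virus": "dengue"}),
    Module("align", inputs=["fetch_sequences"]),
    Module("tree", inputs=["align"]),
]

order = execution_order(pipeline)
print(order)
```

Because every module here has at most one upstream dependency, the resulting order is unique: retrieval, alignment, tree building, then phylogeographic inference. In a real workflow manager the dependency graph is typically inferred from input and output files rather than declared by name as in this sketch.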