Developing reproducible bioinformatics analysis workflows for heterogeneous computing environments to support African genomics.

Affiliations

Department of Digital Technologies, University of Mauritius, Reduit, Mauritius.

Australian Centre for Ancient DNA, University of Adelaide, Adelaide, South Australia, Australia.

Publication information

BMC Bioinformatics. 2018 Nov 29;19(1):457. doi: 10.1186/s12859-018-2446-1.

DOI: 10.1186/s12859-018-2446-1
PMID: 30486782
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC6264621/
Abstract

BACKGROUND

The Pan-African bioinformatics network, H3ABioNet, comprises 27 research institutions in 17 African countries. H3ABioNet is part of the Human Heredity and Health in Africa program (H3Africa), an African-led research consortium funded by the US National Institutes of Health and the UK Wellcome Trust, aimed at using genomics to study and improve the health of Africans. A key role of H3ABioNet is to support H3Africa projects by building bioinformatics infrastructure, such as portable and reproducible bioinformatics workflows for use in heterogeneous African computing environments. Processing and analysis of genomic data is an example of a big data application requiring complex, interdependent data analysis workflows. Such bioinformatics workflows take the primary and secondary input data through several computationally intensive processing steps using different software packages, where some of the outputs form inputs for other steps. Implementing scalable, reproducible, portable and easy-to-use workflows is particularly challenging.
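
The interdependence described above, where some step outputs feed later steps, can be pictured as a directed acyclic graph. The following is only a minimal sketch in Python, not the H3ABioNet implementation: the step names are hypothetical and chosen for illustration, and the ordering uses the standard-library `graphlib` module (Python 3.9 or later).

```python
from graphlib import TopologicalSorter

# Hypothetical step names (not taken from the paper). Each step maps to the
# set of steps whose outputs it consumes as inputs.
steps = {
    "quality_control": set(),
    "align_reads": {"quality_control"},
    "mark_duplicates": {"align_reads"},
    "call_variants": {"mark_duplicates"},
    "annotate_variants": {"call_variants"},
    "summary_report": {"quality_control", "call_variants"},
}


def execution_order(graph):
    """Return one valid order in which every step runs after its inputs exist."""
    return list(TopologicalSorter(graph).static_order())


order = execution_order(steps)
print(order)
```

A workflow engine such as Nextflow or a CWL runner resolves this kind of dependency graph itself; the sketch only shows why a step cannot start until the steps it depends on have finished.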

RESULTS

H3ABioNet has built four workflows to support (1) the calling of variants from high-throughput sequencing data; (2) the analysis of microbial populations from 16S rDNA sequence data; (3) genotyping and genome-wide association studies; and (4) single nucleotide polymorphism imputation. A week-long hackathon was organized in August 2016 with participants from six African bioinformatics groups, and US and European collaborators. Two of the workflows are built using the Common Workflow Language framework (CWL) and two using Nextflow. All the workflows are containerized for improved portability and reproducibility using Docker, and are publicly available for use by members of the H3Africa consortium and the international research community.
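
To illustrate how containerization supports portability, the sketch below assembles a `docker run` command for a single workflow step. It is an illustration under stated assumptions rather than code from the H3ABioNet workflows: the image name `example/aligner:1.0`, the tool arguments, and the host path are all hypothetical. Only the standard Docker options `--rm`, `-v` and `-w` are used.

```python
import shlex


def docker_command(image, tool_args, host_dir, container_dir="/data"):
    """Build an argv list that runs one workflow step inside a container."""
    return [
        "docker", "run", "--rm",              # remove the container when it exits
        "-v", f"{host_dir}:{container_dir}",  # mount the host data directory
        "-w", container_dir,                  # run the tool from the mounted directory
        image,                                # hypothetical image name
        *tool_args,                           # hypothetical tool invocation
    ]


cmd = docker_command(
    "example/aligner:1.0",
    ["align", "--reads", "sample.fastq"],
    "/home/user/project",
)
print(shlex.join(cmd))
```

Because the tool and its dependencies live in the image, the same command can in principle be issued on a laptop, a cluster node or a cloud instance, provided Docker is available there.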

CONCLUSION

The H3ABioNet workflows have been implemented to offer ease of use for the end user and high levels of reproducibility and portability, while following modern, state-of-the-art bioinformatics data processing protocols. The workflows are currently in use and will continue to serve the H3Africa consortium projects. All four workflows are also publicly available for research scientists worldwide to use and adapt to their respective needs. The H3ABioNet workflows will help develop bioinformatics capacity, assist genomics research within Africa, and increase the scientific output of H3Africa and its Pan-African Bioinformatics Network.

Figure 1: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/a214/6264621/3ab33f41d488/12859_2018_2446_Fig1_HTML.jpg
Figure 2: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/a214/6264621/46f3415afc7b/12859_2018_2446_Fig2_HTML.jpg
Figure 3: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/a214/6264621/67453c4c3a66/12859_2018_2446_Fig3_HTML.jpg
Figure 4: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/a214/6264621/3fff11abc71e/12859_2018_2446_Fig4_HTML.jpg
