nf-gwas-pipeline: A Nextflow Genome-Wide Association Study Pipeline.

Authors

Song Zeyuan, Gurinovich Anastasia, Federico Anthony, Monti Stefano, Sebastiani Paola

Affiliations

Department of Biostatistics, Boston University School of Public Health, 801 Massachusetts Avenue 3rd Floor, Boston, MA 02218, USA.

Section of Computational Biomedicine, Boston University School of Medicine, 72 East Concord St., Boston, MA 02218, USA.

Publication information

J Open Source Softw. 2021;6(59). doi: 10.21105/joss.02957. Epub 2021 Mar 2.

DOI: 10.21105/joss.02957
PMID: 35647481
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC9137404/

Abstract

A tool for conducting Genome-Wide Association Study (GWAS) in a systematic, automated and reproducible manner is overdue. We developed an automated GWAS pipeline by combining multiple analysis tools - including bcftools, vcftools, the R packages SNPRelate/GENESIS/GMMAT and ANNOVAR - through Nextflow, which is a portable, flexible, and reproducible reactive workflow framework for developing pipelines. The GWAS pipeline integrates the steps of data quality control and assessment and genetic association analyses, including analysis of cross-sectional and longitudinal studies with either single variants or gene-based tests, into a unified analysis workflow. The pipeline is implemented in Nextflow, dependencies are distributed through Docker, and the code is publicly available on Github.
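
The two stages named in the abstract, quality control followed by single-variant association testing, can be illustrated with a small sketch. This is not the pipeline's code: the pipeline delegates these steps to bcftools, vcftools, and the R packages GENESIS and GMMAT (which fit mixed models), whereas the sketch below substitutes a plain per-variant linear regression with a normal-approximation p-value. The function names, the `min_maf` threshold, and the toy data are all hypothetical and chosen only for illustration.

```python
import math


def minor_allele_frequency(dosages):
    """Minor allele frequency from genotype dosages coded 0/1/2."""
    freq = sum(dosages) / (2 * len(dosages))
    return min(freq, 1 - freq)


def single_variant_test(dosages, phenotype):
    """Regress a quantitative phenotype on the dosages of one variant.

    Returns (beta, standard_error, z_statistic, two_sided_p).
    The p-value uses a normal approximation, which is only
    reasonable for large samples (an assumption of this sketch).
    """
    n = len(dosages)
    if n != len(phenotype) or n < 3:
        raise ValueError("need matching vectors with at least 3 samples")
    mean_g = sum(dosages) / n
    mean_y = sum(phenotype) / n
    sxx = sum((g - mean_g) ** 2 for g in dosages)
    if sxx == 0:
        raise ValueError("monomorphic variant: no genotype variation")
    sxy = sum((g - mean_g) * (y - mean_y) for g, y in zip(dosages, phenotype))
    beta = sxy / sxx
    intercept = mean_y - beta * mean_g
    rss = sum((y - intercept - beta * g) ** 2 for g, y in zip(dosages, phenotype))
    se = math.sqrt(rss / (n - 2) / sxx)
    # A perfect fit gives se == 0; report an infinite statistic in that case.
    z = beta / se if se > 0 else math.copysign(math.inf, beta)
    p = math.erfc(abs(z) / math.sqrt(2))
    return beta, se, z, p


def run_gwas(variants, phenotype, min_maf=0.05):
    """QC step (hypothetical MAF filter), then one test per remaining variant."""
    results = {}
    for variant_id, dosages in variants.items():
        if minor_allele_frequency(dosages) < min_maf:
            continue  # variant fails quality control and is skipped
        results[variant_id] = single_variant_test(dosages, phenotype)
    return results


if __name__ == "__main__":
    # Toy data: six samples, two made-up variants.
    phenotype = [1.0, 1.2, 2.0, 2.2, 3.0, 3.2]
    variants = {
        "demo_common": [0, 0, 1, 1, 2, 2],
        "demo_mono": [0, 0, 0, 0, 0, 0],
    }
    for variant_id, (beta, se, z, p) in run_gwas(variants, phenotype).items():
        print(variant_id, beta, se, z, p)
```

A real analysis would also adjust for covariates, population structure, and relatedness, which is why the pipeline relies on mixed-model packages rather than a regression of this kind.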

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/e7a3/9137404/154d5a42bd30/nihms-1778771-f0001.jpg

