
Genal: a Python toolkit for genetic risk scoring and Mendelian randomization.

Authors

Rivier Cyprien A, Clocchiatti-Tuozzo Santiago, Huo Shufan, Torres-Lopez Victor, Renedo Daniela, Sheth Kevin N, Falcone Guido J, Acosta Julian N

Affiliations

Department of Neurology, Yale School of Medicine, New Haven, CT 06510, United States.

Yale Center for Brain and Mind Health, Yale School of Medicine, New Haven, CT 06510, United States.

Publication

Bioinform Adv. 2024 Dec 24;5(1):vbae207. doi: 10.1093/bioadv/vbae207. eCollection 2025.

DOI: 10.1093/bioadv/vbae207
PMID: 39776894
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC11706532/

Abstract

MOTIVATION

The expansion of genetic association data from genome-wide association studies has increased the importance of methodologies like Polygenic Risk Scores (PRS) and Mendelian Randomization (MR) in genetic epidemiology. However, their application is often impeded by complex, multi-step workflows requiring specialized expertise and the use of disparate tools with varying data formatting requirements. Existing solutions are frequently standalone packages or command-line based, largely due to dependencies on tools like PLINK, which limits accessibility for researchers without computational experience. Given Python's popularity and ease of use, there is a need for an integrated, user-friendly Python toolkit to streamline PRS and MR analyses.

RESULTS

We introduce Genal, a Python package that consolidates SNP-level data handling, cleaning, clumping, PRS computation, and MR analyses into a single, cohesive toolkit. Genal eliminates the need for multiple R packages and, by wrapping PLINK, for command-line interaction, which lowers the barrier for medical scientists to perform complex genetic epidemiology studies. Genal draws on concepts from several well-established tools, giving users access to rigorous statistical techniques in the intuitive Python environment. Additionally, Genal uses parallel processing for MR methods, including MR-PRESSO, significantly reducing the computational time required for these analyses.
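
To make the two core quantities concrete, here is a minimal sketch in plain Python. It is not Genal's API: the function names `polygenic_risk_score` and `ivw_estimate` and the toy numbers are illustrative assumptions. It computes a PRS as a weighted sum of effect-allele dosages, and a fixed-effect inverse-variance weighted (IVW) estimate, one common MR estimator, from per-SNP summary statistics.

```python
import math


def polygenic_risk_score(dosages, weights):
    """Weighted sum of effect-allele dosages (0, 1, or 2) for one individual.

    Hypothetical helper for illustration; not part of any library.
    """
    if len(dosages) != len(weights):
        raise ValueError("dosages and weights must have the same length")
    return sum(d * w for d, w in zip(dosages, weights))


def ivw_estimate(beta_exposure, beta_outcome, se_outcome):
    """Fixed-effect IVW causal estimate and its standard error.

    Each SNP contributes a ratio beta_outcome / beta_exposure, weighted by
    beta_exposure**2 / se_outcome**2. Hypothetical helper for illustration.
    """
    if not (len(beta_exposure) == len(beta_outcome) == len(se_outcome)):
        raise ValueError("all inputs must have the same length")
    numerator = sum(
        bx * by / se ** 2
        for bx, by, se in zip(beta_exposure, beta_outcome, se_outcome)
    )
    denominator = sum(
        bx ** 2 / se ** 2 for bx, se in zip(beta_exposure, se_outcome)
    )
    if denominator == 0:
        raise ValueError("no SNP has a non-zero exposure effect")
    return numerator / denominator, math.sqrt(1.0 / denominator)


# Toy data (made up): three SNPs.
dosages = [0, 1, 2]
weights = [0.10, -0.05, 0.20]
prs = polygenic_risk_score(dosages, weights)

beta_exposure = [0.10, 0.20, 0.30]
beta_outcome = [0.05, 0.10, 0.15]  # exactly half of each exposure effect
se_outcome = [0.01, 0.02, 0.03]
estimate, std_error = ivw_estimate(beta_exposure, beta_outcome, se_outcome)

print(f"PRS = {prs:.2f}")
print(f"IVW estimate = {estimate:.3f} (SE {std_error:.3f})")
```

In this toy data every outcome effect is half the matching exposure effect, so the IVW estimate recovers a causal effect of 0.5. Real analyses also need the steps the abstract lists before this point (cleaning, clumping, allele harmonization), which the sketch omits.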

AVAILABILITY AND IMPLEMENTATION

The package is available on PyPI (https://pypi.org/project/genal-python/). The code is openly available on GitHub with a tutorial (https://github.com/CypRiv/genal), and the documentation is hosted on Read the Docs (https://genal.rtfd.io).


Figure 1: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/1107/11706532/36abb61c2753/vbae207f1.jpg
Figure 2: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/1107/11706532/5ba570405cb1/vbae207f2.jpg
Figure 3: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/1107/11706532/09cfeac686b0/vbae207f3.jpg

