GSEApy: a comprehensive package for performing gene set enrichment analysis in Python.

Author affiliations

Department of Anesthesia, Pain and Perioperative Medicine, Stanford University School of Medicine, Stanford, CA 94305, USA.

Department of Otolaryngology-Head and Neck Surgery, Stanford University School of Medicine, Stanford, CA 94305, USA.

Publication information

Bioinformatics. 2023 Jan 1;39(1). doi: 10.1093/bioinformatics/btac757.

Abstract

MOTIVATION

Gene set enrichment analysis (GSEA) is a commonly used algorithm for characterizing gene expression changes. However, the currently available tools for performing GSEA have a limited ability to analyze large datasets, which is particularly problematic for the analysis of single-cell data. To overcome this limitation, we developed a GSEA package in Python (GSEApy), which can efficiently analyze large single-cell datasets.

RESULTS

We present a package (GSEApy) that performs GSEA from either the command line or a Python environment. GSEApy uses a Rust implementation to calculate the same enrichment statistic as GSEA for a collection of pathways. The Rust implementation of GSEApy is 3-fold faster than the NumPy version of GSEApy (v0.10.8) and uses >4-fold less memory. GSEApy also provides an interface between Python and the Enrichr web services, as well as BioMart. The Enrichr application programming interface enables GSEApy to perform over-representation analysis for an input gene list. Furthermore, GSEApy consists of several tools, each designed to facilitate a particular type of enrichment analysis.
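To make the two statistics mentioned above more concrete, the following is a minimal pure-Python sketch, not GSEApy's Rust code or its API. It assumes the enrichment statistic is the weighted running-sum score described in the original GSEA method, and it assumes over-representation is scored with a one-sided hypergeometric test, which is a common choice but is not specified in this abstract. The function names `enrichment_score` and `ora_pvalue`, their parameters, and the toy genes are hypothetical.

```python
from math import comb


def enrichment_score(ranked_genes, scores, gene_set, weight=1.0):
    """Weighted running-sum enrichment score (illustrative sketch).

    ranked_genes: gene names sorted by their ranking metric.
    scores: the ranking metric for each gene, in the same order.
    gene_set: the pathway members to test.
    """
    members = set(gene_set)
    hits = [gene in members for gene in ranked_genes]
    n_miss = len(hits) - sum(hits)
    # Each hit contributes |score|^weight; misses contribute a constant step.
    hit_weights = [abs(s) ** weight if h else 0.0 for s, h in zip(scores, hits)]
    norm = sum(hit_weights)
    if norm == 0 or n_miss == 0:
        raise ValueError("gene set must cover some, but not all, ranked genes")
    running, best = 0.0, 0.0
    for w, h in zip(hit_weights, hits):
        running += w / norm if h else -1.0 / n_miss
        # Keep the running-sum value that deviates most from zero.
        if abs(running) > abs(best):
            best = running
    return best


def ora_pvalue(population_size, set_size, list_size, overlap):
    """One-sided hypergeometric p-value: P(overlap or more by chance).

    Assumes set_size <= population_size and list_size <= population_size.
    """
    upper = min(set_size, list_size)
    tail = sum(
        comb(set_size, i) * comb(population_size - set_size, list_size - i)
        for i in range(overlap, upper + 1)
    )
    return tail / comb(population_size, list_size)


# Toy data (hypothetical): six genes ranked from most up- to most down-regulated.
ranked = ["A", "B", "C", "D", "E", "F"]
metric = [3.0, 2.0, 1.0, -1.0, -2.0, -3.0]
es_top = enrichment_score(ranked, metric, {"A", "B"})      # set at the top of the list
es_bottom = enrichment_score(ranked, metric, {"E", "F"})   # set at the bottom of the list
p_overlap = ora_pvalue(population_size=10, set_size=3, list_size=3, overlap=3)
```

In this toy example a gene set concentrated at the top of the ranking yields a positive score and one concentrated at the bottom yields a negative score. A real analysis would additionally estimate significance by permutation and correct for multiple testing across pathways, which this sketch omits.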

AVAILABILITY AND IMPLEMENTATION

The new GSEApy with the Rust extension is deposited in PyPI: https://pypi.org/project/gseapy/. The GSEApy source code is freely available at https://github.com/zqfang/GSEApy. Documentation is available at https://gseapy.rtfd.io/.

SUPPLEMENTARY INFORMATION

Supplementary data are available at Bioinformatics online.

Figure image: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/944c/9805564/9f260dc1fa0c/btac757f1.jpg
