DiSC: a statistical tool for fast differential expression analysis of individual-level single-cell RNA-seq data.

Author information

Zhang Lujun, Yang Lu, Ren Yingxue, Zhang Shuwen, Guan Weihua, Chen Jun

Affiliations

Division of Biostatistics and Health Data Science, School of Public Health, University of Minnesota, Minneapolis, MN 55455, United States.

Department of Quantitative Health Sciences, Mayo Clinic, Rochester, MN 55905, United States.

Publication information

Bioinformatics. 2025 Jun 2;41(6). doi: 10.1093/bioinformatics/btaf327.

Abstract

MOTIVATION

Single-cell RNA sequencing (scRNA-seq) has become an important method for characterizing cellular heterogeneity, revealing more biological insights than bulk RNA-seq. The surge in scRNA-seq data across multiple individuals calls for efficient and statistically powerful methods for differential expression (DE) analysis that address individual-level biological variability.

RESULTS

We introduced DiSC, a method for conducting individual-level DE analysis by extracting multiple distributional characteristics, jointly testing their association with a variable of interest, and using a flexible permutation testing framework to control the false discovery rate (FDR). Our simulation studies demonstrated that DiSC effectively controlled the FDR across various settings and exhibited high statistical power in detecting different types of gene expression changes. Moreover, DiSC is computationally efficient and scalable to the rapidly increasing sample sizes in scRNA-seq studies. When applying DiSC to identify DE genes potentially associated with COVID-19 severity and Alzheimer's disease across various types of peripheral blood mononuclear cells and neural cells, we found that our method was approximately 100 times faster than other state-of-the-art methods, and that the results were consistent and supported by existing literature. While DiSC was developed for scRNA-seq data, its robust testing framework can also be applied to other types of single-cell data. When applied to cytometry by time-of-flight data, DiSC identified significantly more DE markers than traditional methods.
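
The three steps named above (summarize each individual's cells into distributional features, test those features jointly against a variable of interest, and calibrate by permutation) can be illustrated with a minimal Python sketch. This is not the DiSC algorithm or the SingleCellStat API. The feature set (prevalence, mean, and standard deviation of log-transformed counts), the joint statistic (sum of squared Pearson correlations), and all function names are assumptions made for illustration. In addition, Benjamini-Hochberg adjustment of per-gene permutation p-values is swapped in for the permutation-based FDR control that the abstract describes.

```python
import numpy as np

def individual_features(counts, individual_ids):
    """Summarize each individual's cells into per-gene features (illustrative choice).

    counts: (n_cells, n_genes) array; individual_ids: (n_cells,) labels.
    Returns the sorted individual labels and an array of shape
    (n_individuals, n_genes, 3): prevalence, mean, and SD of log1p counts.
    """
    individuals = np.unique(individual_ids)
    feats = np.empty((len(individuals), counts.shape[1], 3))
    for i, ind in enumerate(individuals):
        x = np.log1p(counts[individual_ids == ind])
        feats[i, :, 0] = (x > 0).mean(axis=0)  # fraction of expressing cells
        feats[i, :, 1] = x.mean(axis=0)        # mean expression
        feats[i, :, 2] = x.std(axis=0)         # within-individual spread
    return individuals, feats

def joint_statistic(feats, y):
    """Per gene, sum the squared Pearson correlation of each feature with y."""
    yc = y - y.mean()
    fc = feats - feats.mean(axis=0, keepdims=True)
    num = np.einsum("i,igk->gk", yc, fc)
    den = np.sqrt((yc ** 2).sum() * (fc ** 2).sum(axis=0))
    # Constant features have zero variance; their correlation is set to 0.
    r = np.divide(num, den, out=np.zeros_like(num), where=den > 0)
    return (r ** 2).sum(axis=1)

def permutation_pvalues(feats, y, n_perm=999, seed=0):
    """Permute y across individuals (not cells) to get per-gene p-values."""
    rng = np.random.default_rng(seed)
    observed = joint_statistic(feats, y)
    exceed = np.zeros_like(observed)
    for _ in range(n_perm):
        exceed += joint_statistic(feats, rng.permutation(y)) >= observed
    return (exceed + 1) / (n_perm + 1)

def benjamini_hochberg(p):
    """Benjamini-Hochberg adjusted p-values (stand-in for permutation-based FDR)."""
    p = np.asarray(p, dtype=float)
    m = p.size
    order = np.argsort(p)
    scaled = p[order] * m / np.arange(1, m + 1)
    adjusted = np.minimum.accumulate(scaled[::-1])[::-1]
    out = np.empty(m)
    out[order] = np.minimum(adjusted, 1.0)
    return out

if __name__ == "__main__":
    # Simulated data: 10 individuals, 30 cells each, 5 genes; gene 0 differs by group.
    rng = np.random.default_rng(1)
    ids = np.repeat(np.arange(10), 30)
    group = np.array([0, 0, 0, 0, 0, 1, 1, 1, 1, 1], dtype=float)  # ordered as np.unique(ids)
    counts = rng.poisson(1.0, size=(300, 5))
    counts[ids >= 5, 0] = rng.poisson(20.0, size=150)
    _, feats = individual_features(counts, ids)
    pvals = permutation_pvalues(feats, group, n_perm=199)
    print(benjamini_hochberg(pvals))
```

Permuting the variable of interest across individuals rather than across cells keeps each individual's cells together, which is what lets a test of this kind account for individual-level variability. Summarizing before testing also means the cost of each permutation depends on the number of individuals, not the number of cells.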

AVAILABILITY AND IMPLEMENTATION

The R software package "SingleCellStat" is freely available on CRAN (https://cran.r-project.org/web/packages/SingleCellStat/index.html) and GitHub (https://github.com/Lujun995/DiSC). The replication code for reproducing the analyses in this study is publicly accessible at https://github.com/Lujun995/DiSC_Replication_Code. The scRNA-seq expression matrix and metadata utilized in our simulations and analyses can be retrieved from https://cells.ucsc.edu/autism/rawMatrix.zip, https://cellxgene.cziscience.com/collections/1ca90a2d-2943-483d-b678-b809bf464c30, and https://covid19.cog.sanger.ac.uk/submissions/release1/haniffa21.processed.h5ad.
