Variance-adjusted Mahalanobis (VAM): a fast and accurate method for cell-specific gene set scoring.

Affiliation

Department of Biomedical Data Science, Geisel School of Medicine, Dartmouth College, Hanover, NH 03755, USA.

Publication information

Nucleic Acids Res. 2020 Sep 18;48(16):e94. doi: 10.1093/nar/gkaa582.

DOI:10.1093/nar/gkaa582

PMID:32633778

Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC7498348/

Abstract

Statistical analysis of single cell RNA-sequencing (scRNA-seq) data is hindered by high levels of technical noise and inflated zero counts. One promising approach for addressing these challenges is gene set testing, or pathway analysis, which can mitigate sparsity and noise, and improve interpretation and power, by aggregating expression data to the pathway level. Unfortunately, methods optimized for bulk transcriptomics perform poorly on scRNA-seq data and progress on single cell-specific techniques has been limited. Importantly, no existing methods support cell-level gene set inference. To address this challenge, we developed a new gene set testing method, Variance-adjusted Mahalanobis (VAM), that integrates with the Seurat framework and can accommodate the technical noise, sparsity and large sample sizes characteristic of scRNA-seq data. The VAM method computes cell-specific pathway scores to transform a cell-by-gene matrix into a cell-by-pathway matrix that can be used for both data visualization and statistical enrichment analysis. Because the distribution of these scores under the null of uncorrelated technical noise has an accurate gamma approximation, both population-level and cell-level inference are supported. As demonstrated using simulated and real scRNA-seq data, the VAM method provides superior classification accuracy at a lower computational cost relative to existing single-sample gene set testing approaches.
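The abstract gives the idea but not the formulas, so the following is only a minimal Python sketch of a VAM-style score under stated assumptions, not the authors' implementation or the API of the VAM R package. It assumes that the score for a cell is a squared distance from the origin over the genes in the set, with each gene scaled by a technical variance estimate. It also assumes that the null distribution is obtained by permuting each gene independently across cells, and that the gamma approximation is fitted by the method of moments. All function names (`squared_distances`, `fit_gamma_moments`, `gamma_cdf`, `vam_like_scores`) and the toy data are hypothetical.

```python
import math
import random

def squared_distances(expr, gene_idx, tech_var):
    """Variance-adjusted squared distance from the origin for each cell.

    expr: list of cells, each a list of per-gene expression values.
    gene_idx: column indices of the genes in the set.
    tech_var: per-gene technical variance estimates (assumed given).
    """
    return [sum(row[g] ** 2 / tech_var[g] for g in gene_idx) for row in expr]

def fit_gamma_moments(values):
    """Method-of-moments gamma fit; returns (shape, scale)."""
    n = len(values)
    mean = sum(values) / n
    var = sum((v - mean) ** 2 for v in values) / (n - 1)
    return mean * mean / var, var / mean

def gamma_cdf(x, shape, scale):
    """Gamma CDF via the series for the regularized lower incomplete gamma.

    Adequate for the moderate arguments used in this sketch.
    """
    if x <= 0:
        return 0.0
    z = x / scale
    term = 1.0 / shape
    total = term
    n = 1
    while abs(term) > 1e-15 * abs(total) and n < 10000:
        term *= z / (shape + n)
        total += term
        n += 1
    log_prefactor = -z + shape * math.log(z) - math.lgamma(shape)
    return min(1.0, total * math.exp(log_prefactor))

def vam_like_scores(expr, gene_idx, tech_var, seed=0):
    """Cell-level scores in [0, 1]: gamma CDF of each observed distance."""
    rng = random.Random(seed)
    observed = squared_distances(expr, gene_idx, tech_var)
    # Assumed null: shuffle each gene independently across cells, which
    # keeps per-gene marginals but removes correlation between genes.
    permuted = [list(row) for row in expr]
    for g in gene_idx:
        column = [row[g] for row in expr]
        rng.shuffle(column)
        for i, value in enumerate(column):
            permuted[i][g] = value
    null = squared_distances(permuted, gene_idx, tech_var)
    shape, scale = fit_gamma_moments(null)
    return [gamma_cdf(d, shape, scale) for d in observed]

# Toy data: 6 cells x 4 genes; the hypothetical gene set is genes 0 and 1.
expr = [
    [3.0, 2.5, 0.0, 1.0],
    [2.8, 3.1, 1.0, 0.0],
    [0.0, 0.2, 2.0, 1.5],
    [0.1, 0.0, 1.8, 2.2],
    [0.0, 0.0, 0.5, 0.3],
    [0.2, 0.1, 0.0, 2.0],
]
scores = vam_like_scores(expr, gene_idx=[0, 1], tech_var=[1.0, 1.0, 1.0, 1.0])
for cell, score in enumerate(scores):
    print(f"cell {cell}: score {score:.3f}")
```

Applying `vam_like_scores` once per gene set and stacking the results column-wise would give the cell-by-pathway matrix the abstract describes. Because each score is a CDF value under the fitted null, one minus the score can be read as an approximate cell-level p-value in this sketch.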

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/0b2d/7498348/0ce53150e442/gkaa582fig1.jpg
