Correspondence analysis for dimension reduction, batch integration, and visualization of single-cell RNA-seq data.

Affiliations

Department of Biostatistics, Harvard TH Chan School of Public Health, Boston, MA, USA.

Department of Cancer Immunology and Virology, Dana-Farber Cancer Institute, Boston, MA, USA.

Publication information

Sci Rep. 2023 Jan 21;13(1):1197. doi: 10.1038/s41598-022-26434-1.

DOI:10.1038/s41598-022-26434-1
PMID:36681709
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC9867729/
Abstract

Effective dimension reduction is essential for single cell RNA-seq (scRNAseq) analysis. Principal component analysis (PCA) is widely used, but requires continuous, normally-distributed data; therefore, it is often coupled with log-transformation in scRNAseq applications, which can distort the data and obscure meaningful variation. We describe correspondence analysis (CA), a count-based alternative to PCA. CA is based on decomposition of a chi-squared residual matrix, avoiding distortive log-transformation. To address overdispersion and high sparsity in scRNAseq data, we propose five adaptations of CA, which are fast, scalable, and outperform standard CA and glmPCA, to compute cell embeddings with more performant or comparable clustering accuracy in 8 out of 9 datasets. In particular, we find that CA with Freeman-Tukey residuals performs especially well across diverse datasets. Other advantages of the CA framework include visualization of associations between genes and cell populations in a "CA biplot," and extension to multi-table analysis; we introduce corralm for integrative multi-table dimension reduction of scRNAseq data. We implement CA for scRNAseq data in corral, an R/Bioconductor package which interfaces directly with single cell classes in Bioconductor. Switching from PCA to CA is achieved through a simple pipeline substitution and improves dimension reduction of scRNAseq datasets.
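The core idea in the abstract, decomposing a residual matrix computed directly from counts, can be illustrated with a minimal NumPy sketch. This is not the corral implementation. The function names `ca_residuals` and `ca_embed`, the genes-by-cells orientation, and the exact scaling of the Freeman-Tukey residual (the classical deviate sqrt(x) + sqrt(x + 1) - sqrt(4m + 1), rescaled to proportions) are assumptions made for illustration.

```python
import numpy as np


def ca_residuals(counts, kind="pearson"):
    """Residual matrix for correspondence analysis of a count table.

    Assumes rows are genes and columns are cells, with no all-zero
    row or column (filter those out first).
    """
    X = np.asarray(counts, dtype=float)
    if (X.sum(axis=0) == 0).any() or (X.sum(axis=1) == 0).any():
        raise ValueError("remove all-zero rows and columns first")
    N = X.sum()
    P = X / N                          # observed proportions
    r = P.sum(axis=1, keepdims=True)   # row masses, shape (genes, 1)
    c = P.sum(axis=0, keepdims=True)   # column masses, shape (1, cells)
    E = r @ c                          # expected proportions under independence
    if kind == "pearson":
        # Standard CA: chi-squared (Pearson) residuals.
        return (P - E) / np.sqrt(E)
    if kind == "freeman-tukey":
        # Assumed form: classical Freeman-Tukey deviate divided by sqrt(N).
        return np.sqrt(P) + np.sqrt(P + 1.0 / N) - np.sqrt(4.0 * E + 1.0 / N)
    raise ValueError(f"unknown residual kind: {kind!r}")


def ca_embed(counts, n_components=2, kind="pearson"):
    """SVD of the residual matrix; returns (gene_emb, cell_emb, singular_values)."""
    S = ca_residuals(counts, kind)
    U, d, Vt = np.linalg.svd(S, full_matrices=False)
    k = n_components
    # Scale singular vectors by singular values to get low-dimensional coordinates.
    return U[:, :k] * d[:k], Vt[:k].T * d[:k], d[:k]


# Toy example: 30 genes x 12 cells of simulated counts (the +1 avoids empty rows).
rng = np.random.default_rng(0)
toy_counts = rng.poisson(2.0, size=(30, 12)) + 1
gene_emb, cell_emb, sv = ca_embed(toy_counts, n_components=3, kind="freeman-tukey")
```

No log-transformation appears anywhere in the sketch: the counts are converted to proportions, compared with their expectation under row-column independence, and the resulting residuals are factorized. Under these assumptions, swapping `kind` between `"pearson"` and `"freeman-tukey"` is the only change needed to move from standard CA to the Freeman-Tukey variant.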


https://cdn.ncbi.nlm.nih.gov/pmc/blobs/1cbe/9867729/3d253090d645/41598_2022_26434_Fig1_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/1cbe/9867729/e635cacd3bf9/41598_2022_26434_Fig2_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/1cbe/9867729/6e67fa28cbc0/41598_2022_26434_Fig3_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/1cbe/9867729/9d017d8419fb/41598_2022_26434_Fig4_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/1cbe/9867729/5817f6ddbb1e/41598_2022_26434_Fig5_HTML.jpg

