Visualizing Single-Cell RNA-seq Data with Semisupervised Principal Component Analysis.

Affiliation

Department of Public Health Sciences, Penn State College of Medicine, Hershey, PA 17033, USA.

Publication information

Int J Mol Sci. 2020 Aug 12;21(16):5797. doi: 10.3390/ijms21165797.

DOI:10.3390/ijms21165797
PMID:32806757
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC7460854/
Abstract

Single-cell RNA-seq (scRNA-seq) is a powerful tool for analyzing heterogeneous and functionally diverse cell populations. Visualizing scRNA-seq data helps extract meaningful biological information and identify novel cell subtypes. Currently, the most popular methods for scRNA-seq visualization are principal component analysis (PCA) and t-distributed stochastic neighbor embedding (t-SNE). While PCA is an unsupervised dimension reduction technique, t-SNE incorporates cluster information into pairwise probabilities and then minimizes the Kullback-Leibler divergence. Uniform Manifold Approximation and Projection (UMAP) is a more recently developed visualization method similar to t-SNE. However, one limitation of UMAP and t-SNE is that they capture only the local structure of the data; the global structure is not faithfully preserved. In this manuscript, we propose a semisupervised principal component analysis (ssPCA) approach for scRNA-seq visualization. The proposed approach incorporates cluster labels into dimension reduction and finds principal components that maximize both data variance and cluster dependence. ssPCA requires cluster labels as input, so it is most useful for visualizing clusters produced by scRNA-seq clustering software. Our experiments with simulated and real scRNA-seq data demonstrate that ssPCA preserves both the local and global structure of the data and uncovers transitions and progressions in the data, if they exist. In addition, ssPCA is convex and has a globally optimal solution. It is also robust and computationally efficient, making it viable for scRNA-seq cluster visualization.
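The abstract does not give the ssPCA objective, so the following is only a minimal sketch of one way to combine "data variance" and "cluster dependence" in a single eigenproblem, not the authors' implementation. It assumes a Hilbert-Schmidt-style dependence term with a delta kernel on the cluster labels, and it introduces a hypothetical mixing weight `alpha` (0 gives ordinary PCA, 1 uses only the label term). The function name `sspca_sketch` is also an assumption.

```python
import numpy as np

def sspca_sketch(X, labels, n_components=2, alpha=0.5):
    """Illustrative semisupervised PCA (assumed formulation, not the paper's code).

    X      : (n_cells, n_genes) expression matrix
    labels : (n_cells,) cluster labels from any clustering tool
    alpha  : hypothetical weight between variance (0) and label dependence (1)
    """
    X = np.asarray(X, dtype=float)
    labels = np.asarray(labels)
    n = X.shape[0]

    # Center each gene; equivalent to applying the centering matrix H to X.
    Xc = X - X.mean(axis=0)

    # Delta kernel on labels: 1 if two cells share a cluster, else 0.
    L = (labels[:, None] == labels[None, :]).astype(float)

    # Blend the identity (plain variance) with the label kernel (dependence).
    M = (1.0 - alpha) * np.eye(n) + alpha * L

    # Symmetric positive semidefinite matrix whose top eigenvectors
    # maximize trace(U^T Xc^T M Xc U) subject to U^T U = I.
    Q = Xc.T @ M @ Xc
    eigvals, eigvecs = np.linalg.eigh(Q)

    # eigh returns eigenvalues in ascending order; keep the largest ones.
    order = np.argsort(eigvals)[::-1][:n_components]
    U = eigvecs[:, order]

    # Return the low-dimensional embedding and the loading vectors.
    return Xc @ U, U
```

Because the solution comes from a single symmetric eigendecomposition, it is deterministic given the data and labels, which is in the spirit of the abstract's remark about a global optimum. The embedding is a linear projection, so distances between clusters remain interpretable in the same way as in PCA.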

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/f5ac/7460854/5f549bb8b7a2/ijms-21-05797-g003.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/f5ac/7460854/372f0aa20aec/ijms-21-05797-g0A1.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/f5ac/7460854/797d67cc80be/ijms-21-05797-g0A2.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/f5ac/7460854/0d267564c441/ijms-21-05797-g0A3.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/f5ac/7460854/52076adcfdb0/ijms-21-05797-g0A4.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/f5ac/7460854/e7ff424bbbe8/ijms-21-05797-g001.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/f5ac/7460854/c213fb89ae38/ijms-21-05797-g002.jpg