Analyzing scRNA-seq data by CCP-assisted UMAP and tSNE.

Authors

Hozumi Yuta, Wei Guo-Wei

Affiliations

Department of Mathematics, Michigan State University, East Lansing, Michigan, United States of America.

Department of Electrical and Computer Engineering, Michigan State University, East Lansing, Michigan, United States of America.

Publication information

PLoS One. 2024 Dec 13;19(12):e0311791. doi: 10.1371/journal.pone.0311791. eCollection 2024.

DOI:10.1371/journal.pone.0311791
PMID:39671349
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC11642954/
Abstract

Single-cell RNA sequencing (scRNA-seq) is widely used to reveal heterogeneity in cells, which has given us insights into cell-cell communication, cell differentiation, and differential gene expression. However, analyzing scRNA-seq data is a challenge due to sparsity and the large number of genes involved. Therefore, dimensionality reduction and feature selection are important for removing spurious signals and enhancing downstream analysis. Correlated clustering and projection (CCP) was recently introduced as an effective method for preprocessing scRNA-seq data. CCP utilizes gene-gene correlations to partition the genes and, based on the partition, employs cell-cell interactions to obtain super-genes. Because CCP is a data-domain approach that does not require matrix diagonalization, it can be used in many downstream machine learning tasks. In this work, we utilize CCP as an initialization tool for uniform manifold approximation and projection (UMAP) and t-distributed stochastic neighbor embedding (tSNE). Using 21 publicly available datasets, we have found that CCP significantly improves UMAP and tSNE visualization and dramatically improves their accuracy. More specifically, CCP improves UMAP by 22% in ARI, 14% in NMI and 15% in ECM, and improves tSNE by 11% in ARI, 9% in NMI and 8% in ECM.
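
The pipeline the abstract describes (partition genes by correlation, build one super-gene per partition from cell-cell interactions, then embed and score) can be illustrated with a rough sketch. This is not the authors' implementation: the function name `ccp_like_supergenes`, the use of k-means for the gene partition, the Gaussian kernel, the synthetic data, and every parameter value are assumptions made for illustration only. It also treats "initialization" as feeding the super-genes to t-SNE as input, which is one reading of the abstract, and it uses scikit-learn's t-SNE rather than UMAP to keep dependencies small.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.manifold import TSNE
from sklearn.metrics import adjusted_rand_score, normalized_mutual_info_score


def ccp_like_supergenes(X, n_supergenes=8, seed=0):
    """Hypothetical CCP-style reduction. X is cells x genes; returns cells x n_supergenes."""
    # Step 1 (assumed): partition genes by correlation. After centering and
    # unit-normalizing each gene profile, squared Euclidean distance equals
    # 2 * (1 - Pearson r), so k-means groups positively correlated genes.
    G = X.T.astype(float)
    G = G - G.mean(axis=1, keepdims=True)
    norms = np.linalg.norm(G, axis=1, keepdims=True)
    G = G / np.where(norms > 0, norms, 1.0)
    gene_labels = KMeans(n_clusters=n_supergenes, n_init=10,
                         random_state=seed).fit_predict(G)

    # Step 2 (assumed): for each gene group, summarize cell-cell interactions
    # as the sum of a Gaussian kernel over pairwise cell distances.
    Z = np.zeros((X.shape[0], n_supergenes))
    for k in range(n_supergenes):
        sub = X[:, gene_labels == k]
        if sub.shape[1] == 0:
            continue  # guard against an empty gene group
        sq = np.sum(sub ** 2, axis=1)
        d = np.sqrt(np.maximum(sq[:, None] + sq[None, :] - 2.0 * sub @ sub.T, 0.0))
        scale = d.mean() if d.mean() > 0 else 1.0
        Z[:, k] = np.exp(-(d / scale) ** 2).sum(axis=1)
    return Z


# Synthetic stand-in for an expression matrix: 3 cell groups, 60 genes,
# each group over-expressing its own block of 20 genes.
rng = np.random.default_rng(0)
labels = np.repeat(np.arange(3), 30)
X = rng.poisson(1.0, size=(90, 60)).astype(float)
for c in range(3):
    X[labels == c, c * 20:(c + 1) * 20] += rng.poisson(5.0, size=(30, 20))
X = np.log1p(X)

Z = ccp_like_supergenes(X, n_supergenes=8)

# Embed the super-genes in 2D, cluster the embedding, and score against the
# known groups with ARI and NMI, two of the metrics named in the abstract.
emb = TSNE(n_components=2, perplexity=15, init="pca",
           random_state=0).fit_transform(Z)
pred = KMeans(n_clusters=3, n_init=10, random_state=0).fit_predict(emb)
ari = adjusted_rand_score(labels, pred)
nmi = normalized_mutual_info_score(labels, pred)
print(f"ARI={ari:.3f}  NMI={nmi:.3f}")
```

Scores on this toy data say nothing about the reported results; the sketch only shows where a CCP-style reduction would sit between the expression matrix and the embedding.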


https://cdn.ncbi.nlm.nih.gov/pmc/blobs/71c7/11642954/3d85b98adbf6/pone.0311791.g001.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/71c7/11642954/eebbcd799314/pone.0311791.g002.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/71c7/11642954/73336848cb3c/pone.0311791.g003.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/71c7/11642954/5b2050c670ce/pone.0311791.g004.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/71c7/11642954/91e3b51db7a1/pone.0311791.g005.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/71c7/11642954/ca1d84ffa291/pone.0311791.g006.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/71c7/11642954/75d9f3fe50af/pone.0311791.g007.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/71c7/11642954/aaa7324ed54f/pone.0311791.g008.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/71c7/11642954/e4203ad2042e/pone.0311791.g009.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/71c7/11642954/e77668f92e0f/pone.0311791.g010.jpg
