
Statistical method scDEED for detecting dubious 2D single-cell embeddings and optimizing t-SNE and UMAP hyperparameters.

Affiliations

Department of ISOM, School of Business and Management, Hong Kong University of Science and Technology, Clear Water Bay, Hong Kong, China.

Department of Statistics and Data Science, University of California, Los Angeles, Los Angeles, CA, USA.

Publication information

Nat Commun. 2024 Feb 26;15(1):1753. doi: 10.1038/s41467-024-45891-y.

DOI: 10.1038/s41467-024-45891-y
PMID: 38409103
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC10897166/
Abstract

Two-dimensional (2D) embedding methods are crucial for single-cell data visualization. Popular methods such as t-distributed stochastic neighbor embedding (t-SNE) and uniform manifold approximation and projection (UMAP) are commonly used for visualizing cell clusters; however, it is well known that t-SNE and UMAP's 2D embeddings might not reliably inform the similarities among cell clusters. Motivated by this challenge, we present a statistical method, scDEED, for detecting dubious cell embeddings output by a 2D-embedding method. By calculating a reliability score for every cell embedding based on the similarity between the cell's 2D-embedding neighbors and pre-embedding neighbors, scDEED identifies the cell embeddings with low reliability scores as dubious and those with high reliability scores as trustworthy. Moreover, by minimizing the number of dubious cell embeddings, scDEED provides intuitive guidance for optimizing the hyperparameters of an embedding method. We show the effectiveness of scDEED on multiple datasets for detecting dubious cell embeddings and optimizing the hyperparameters of t-SNE and UMAP.
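
The idea in the abstract can be sketched in a few lines of Python. This is only an illustrative sketch, not the scDEED implementation: the abstract does not spell out how the reliability score or the dubious cutoff is computed, so the code below assumes a Jaccard overlap between each cell's k nearest neighbors before and after embedding as a stand-in score, and a fixed cutoff of 0.2. The function names (`knn_indices`, `reliability_scores`, `count_dubious`, `fewest_dubious`), the toy data, and the candidate labels are all hypothetical.

```python
import numpy as np

def knn_indices(points, k):
    """For each row, return the indices of its k nearest other rows (Euclidean)."""
    diffs = points[:, None, :] - points[None, :, :]
    dist = np.sqrt((diffs ** 2).sum(axis=-1))
    np.fill_diagonal(dist, np.inf)  # a cell is never its own neighbor
    return np.argsort(dist, axis=1)[:, :k]

def reliability_scores(pre_embedding, embedding_2d, k=10):
    """Stand-in score (assumption): Jaccard overlap of each cell's neighbor
    sets in the pre-embedding space and in the 2D embedding."""
    pre_nn = knn_indices(pre_embedding, k)
    emb_nn = knn_indices(embedding_2d, k)
    scores = np.empty(len(pre_embedding))
    for i, (a, b) in enumerate(zip(pre_nn, emb_nn)):
        shared = len(set(a) & set(b))
        scores[i] = shared / (2 * k - shared)
    return scores

def count_dubious(pre_embedding, embedding_2d, k=10, cutoff=0.2):
    """Count cells whose score falls below a fixed cutoff (assumed value)."""
    scores = reliability_scores(pre_embedding, embedding_2d, k)
    return int((scores < cutoff).sum())

def fewest_dubious(pre_embedding, candidates, k=10, cutoff=0.2):
    """Pick the candidate embedding (e.g. one per hyperparameter setting)
    that yields the fewest dubious cells."""
    counts = {name: count_dubious(pre_embedding, emb, k, cutoff)
              for name, emb in candidates.items()}
    return min(counts, key=counts.get), counts

# Toy data: 60 "cells" whose structure lives in the first two of five dimensions.
rng = np.random.default_rng(0)
cells = np.hstack([rng.normal(size=(60, 2)), np.zeros((60, 3))])
candidates = {
    "faithful": cells[:, :2],                    # keeps every neighborhood intact
    "scrambled": rng.permutation(cells[:, :2]),  # rows shuffled, neighborhoods lost
}
best, counts = fewest_dubious(cells, candidates)
print(best, counts)
```

In practice each entry of `candidates` would be the output of t-SNE or UMAP run with a different hyperparameter value (for example, perplexity or number of neighbors), and the setting with the fewest dubious cells would be preferred.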


https://cdn.ncbi.nlm.nih.gov/pmc/blobs/7d9f/10897166/dce1de669bf2/41467_2024_45891_Fig1_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/7d9f/10897166/148e31ed6d80/41467_2024_45891_Fig2_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/7d9f/10897166/f94b6da59086/41467_2024_45891_Fig3_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/7d9f/10897166/6e248ea9b815/41467_2024_45891_Fig4_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/7d9f/10897166/f33217b3fe4c/41467_2024_45891_Fig5_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/7d9f/10897166/b1d71165b3d0/41467_2024_45891_Fig6_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/7d9f/10897166/388094c8a798/41467_2024_45891_Fig7_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/7d9f/10897166/af2c0c632464/41467_2024_45891_Fig8_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/7d9f/10897166/1280de065d09/41467_2024_45891_Fig9_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/7d9f/10897166/7f56a5295f45/41467_2024_45891_Fig10_HTML.jpg

Similar articles

1
Statistical method scDEED for detecting dubious 2D single-cell embeddings and optimizing t-SNE and UMAP hyperparameters.
Nat Commun. 2024 Feb 26;15(1):1753. doi: 10.1038/s41467-024-45891-y.
2
scDEED: a statistical method for detecting dubious 2D single-cell embeddings and optimizing t-SNE and UMAP hyperparameters.
bioRxiv. 2023 Sep 15:2023.04.21.537839. doi: 10.1101/2023.04.21.537839.
3
Dimensionality reduction by UMAP reinforces sample heterogeneity analysis in bulk transcriptomic data.
Cell Rep. 2021 Jul 27;36(4):109442. doi: 10.1016/j.celrep.2021.109442.
4
Shape-aware stochastic neighbor embedding for robust data visualisations.
BMC Bioinformatics. 2022 Nov 14;23(1):477. doi: 10.1186/s12859-022-05028-8.
5
A cross entropy test allows quantitative statistical comparison of t-SNE and UMAP representations.
Cell Rep Methods. 2023 Jan 13;3(1):100390. doi: 10.1016/j.crmeth.2022.100390. eCollection 2023 Jan 23.
6
Evaluation of Distance Metrics and Spatial Autocorrelation in Uniform Manifold Approximation and Projection Applied to Mass Spectrometry Imaging Data.
Anal Chem. 2019 May 7;91(9):5706-5714. doi: 10.1021/acs.analchem.8b05827. Epub 2019 Apr 25.
7
DGCyTOF: Deep learning with graphic cluster visualization to predict cell types of single cell mass cytometry data.
PLoS Comput Biol. 2022 Apr 11;18(4):e1008885. doi: 10.1371/journal.pcbi.1008885. eCollection 2022 Apr.
8
Application of Uniform Manifold Approximation and Projection (UMAP) in spectral imaging of artworks.
Spectrochim Acta A Mol Biomol Spectrosc. 2021 May 5;252:119547. doi: 10.1016/j.saa.2021.119547. Epub 2021 Feb 4.
9
Capturing discrete latent structures: choose LDs over PCs.
Biostatistics. 2022 Dec 12;24(1):1-16. doi: 10.1093/biostatistics/kxab030.
10
Fuzzy Information Discrimination Measures and Their Application to Low Dimensional Embedding Construction in the UMAP Algorithm.
J Imaging. 2022 Apr 15;8(4):113. doi: 10.3390/jimaging8040113.

Cited by

1
shinyUMAP: an online tool for promoting understanding of single cell omics data visualization.
bioRxiv. 2025 Sep 1:2025.08.27.672621. doi: 10.1101/2025.08.27.672621.
2
Heterogeneity of the liver cancer tumor microenvironment: mitochondrial metabolism and causal inference through Mendelian randomization.
Discov Oncol. 2025 Jul 29;16(1):1436. doi: 10.1007/s12672-025-02535-x.
3
Paradigms, innovations, and biological applications of RNA velocity: a comprehensive review.
Brief Bioinform. 2025 Jul 2;26(4). doi: 10.1093/bib/bbaf339.
4
Quality prediction method for automotive body resistance spot welding based on digital twin technology.
Sci Rep. 2025 Jul 8;15(1):24391. doi: 10.1038/s41598-025-09959-z.
5
Iterative clustering algorithm G-DESC-E and pan-cancer key gene analysis based on single-cell sequencing data.
Brief Bioinform. 2025 Jul 2;26(4). doi: 10.1093/bib/bbaf288.
6
Evaluating discrepancies in dimensionality reduction for time-series single-cell RNA-sequencing data.
Brief Bioinform. 2025 May 1;26(3). doi: 10.1093/bib/bbaf287.
7
Single-cell RNA seq data analysis reveals molecular markers and possible treatment targets for laryngeal squamous cell carcinoma (LSCC): an in-silico approach.
In Silico Pharmacol. 2025 Jun 17;13(2):89. doi: 10.1007/s40203-025-00382-w. eCollection 2025.
8
Assessing and improving reliability of neighbor embedding methods: a map-continuity perspective.
Nat Commun. 2025 May 30;16(1):5037. doi: 10.1038/s41467-025-60434-9.
9
Identification of predictive subphenotypes for clinical outcomes using real world data and machine learning.
Nat Commun. 2025 May 12;16(1):3797. doi: 10.1038/s41467-025-59092-8.
10
Target Screening and Single Cell Analysis of Diabetic Retinopathy and Hepatocarcinoma.
J Cell Mol Med. 2025 May;29(9):e70521. doi: 10.1111/jcmm.70521.

References

1
Dynamic visualization of high-dimensional data.
Nat Comput Sci. 2023 Jan;3(1):86-100. doi: 10.1038/s43588-022-00380-4. Epub 2022 Dec 30.
2
The specious art of single-cell genomics.
PLoS Comput Biol. 2023 Aug 17;19(8):e1011288. doi: 10.1371/journal.pcbi.1011288. eCollection 2023 Aug.
3
scDesign3 generates realistic in silico data for multimodal single-cell and spatial omics.
Nat Biotechnol. 2024 Feb;42(2):247-252. doi: 10.1038/s41587-023-01772-1. Epub 2023 May 11.
4
Towards a comprehensive evaluation of dimension reduction methods for transcriptomic data visualization.
Commun Biol. 2022 Jul 19;5(1):719. doi: 10.1038/s42003-022-03628-x.
5
EMBEDR: Distinguishing signal from noise in single-cell omics data.
Patterns (N Y). 2022 Feb 8;3(3):100443. doi: 10.1016/j.patter.2022.100443. eCollection 2022 Mar 11.
6
Simulating Single-Cell Gene Expression Count Data with Preserved Gene Correlations by scDesign2.
J Comput Biol. 2022 Jan;29(1):23-26. doi: 10.1089/cmb.2021.0440. Epub 2022 Jan 11.
7
No evidence that plasmablasts transdifferentiate into developing neutrophils in severe COVID-19 disease.
Clin Transl Immunology. 2021 Jun 30;10(7):e1308. doi: 10.1002/cti2.1308. eCollection 2021.
8
Deep generative model embedding of single-cell RNA-Seq profiles on hyperspheres and hyperbolic spaces.
Nat Commun. 2021 May 5;12(1):2554. doi: 10.1038/s41467-021-22851-4.
9
A generalization of t-SNE and UMAP to single-cell multimodal omics.
Genome Biol. 2021 May 3;22(1):130. doi: 10.1186/s13059-021-02356-5.
10
Initialization is critical for preserving global data structure in both t-SNE and UMAP.
Nat Biotechnol. 2021 Feb;39(2):156-157. doi: 10.1038/s41587-020-00809-z. Epub 2021 Feb 1.