Shape-aware stochastic neighbor embedding for robust data visualisations.

Affiliation

Department of Mathematics, Stockholm University, Stockholm, Sweden.

Publication information

BMC Bioinformatics. 2022 Nov 14;23(1):477. doi: 10.1186/s12859-022-05028-8.

DOI:10.1186/s12859-022-05028-8
PMID:36376789
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC9660178/
Abstract

BACKGROUND

The t-distributed Stochastic Neighbor Embedding (t-SNE) algorithm has emerged as one of the leading methods for visualising high-dimensional (HD) data in a wide variety of fields, especially for revealing cluster structure in HD single-cell transcriptomics data. However, t-SNE often fails to correctly represent hierarchical relationships between clusters and creates spurious patterns in the embedding. In this work we generalised t-SNE using shape-aware graph distances to mitigate some of the limitations of t-SNE. Although many methods have recently been proposed to circumvent the shortcomings of t-SNE, notably Uniform Manifold Approximation and Projection (UMAP) and Potential of Heat-diffusion for Affinity-based Transition Embedding (PHATE), we see a clear advantage of the proposed graph-based method.
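To make the idea of a graph distance concrete, the following is a minimal sketch, not the authors' shape-aware distance, whose exact definition is not given in the abstract. It assumes a plain k-nearest-neighbour graph with Euclidean edge weights and measures distance as the shortest path through that graph. The helper names `knn_graph` and `graph_distances`, the toy half-circle data, and the choice `k=2` are all illustrative assumptions.

```python
import heapq
import math


def knn_graph(points, k):
    """Build an undirected k-nearest-neighbour graph as {i: {j: euclidean_distance}}."""
    graph = {i: {} for i in range(len(points))}
    for i, p in enumerate(points):
        # Distances from point i to every other point, closest first.
        dists = sorted((math.dist(p, q), j) for j, q in enumerate(points) if j != i)
        for d, j in dists[:k]:
            graph[i][j] = d
            graph[j][i] = d  # symmetrise so the graph is undirected
    return graph


def graph_distances(graph, source):
    """Dijkstra shortest-path lengths from `source` to every reachable node."""
    best = {source: 0.0}
    heap = [(0.0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > best.get(u, math.inf):
            continue  # stale heap entry
        for v, w in graph[u].items():
            nd = d + w
            if nd < best.get(v, math.inf):
                best[v] = nd
                heapq.heappush(heap, (nd, v))
    return best


# Toy data (assumption): 50 evenly spaced points on a unit half-circle.
n = 50
points = [(math.cos(math.pi * t / (n - 1)), math.sin(math.pi * t / (n - 1)))
          for t in range(n)]

graph = knn_graph(points, k=2)
geodesic = graph_distances(graph, source=0)

euclidean_end = math.dist(points[0], points[-1])  # straight line across the arc
graph_end = geodesic[n - 1]                       # path that follows the arc
```

On this toy curve the straight-line distance between the two endpoints is the diameter of the circle, while the graph distance follows the arc and should come out close to π. Feeding distances of this kind into the neighbour affinities in place of Euclidean distances is one way an embedding can be made sensitive to the shape of the data; how the paper actually constructs and uses its distances should be checked against the full text.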

RESULTS

The superior performance of the proposed method is first demonstrated on simulated data, where a significant improvement compared to t-SNE, UMAP and PHATE, based on quantitative validation indices, is observed when visualising imbalanced, nonlinear, continuous and hierarchically structured data. Thereafter, the ability of the proposed method, compared to the competing methods, to create faithful low-dimensional embeddings is shown on two real-world data sets: single-cell transcriptomics data and MNIST image data. In addition, the only hyper-parameter of the method can be chosen automatically in a data-driven way, and this choice is consistently optimal across all test cases in this study.

CONCLUSIONS

In this work we show that the proposed shape-aware stochastic neighbor embedding method creates low-dimensional visualisations that robustly and accurately reveal key structures of high-dimensional data.
