Revisiting Dimensionality Reduction Techniques for Visual Cluster Analysis: An Empirical Study.

Authors

Xia Jiazhi, Zhang Yuchen, Song Jie, Chen Yang, Wang Yunhai, Liu Shixia

Publication information

IEEE Trans Vis Comput Graph. 2022 Jan;28(1):529-539. doi: 10.1109/TVCG.2021.3114694. Epub 2021 Dec 24.

DOI:10.1109/TVCG.2021.3114694

Abstract

Dimensionality Reduction (DR) techniques can generate 2D projections and enable visual exploration of the cluster structures of high-dimensional datasets. However, different DR techniques yield different patterns, which significantly affects the performance of visual cluster analysis tasks. We present the results of a user study that investigates the influence of different DR techniques on visual cluster analysis. Our study focuses on the property types of greatest concern, namely linearity and locality, and evaluates twelve representative DR techniques that cover these properties. Four controlled experiments were conducted to evaluate how the DR techniques facilitate the tasks of 1) cluster identification, 2) membership identification, 3) distance comparison, and 4) density comparison, respectively. We also evaluated users' subjective preferences among the DR techniques regarding the quality of projected clusters. The results show that: 1) non-linear and local techniques are preferred for cluster identification and membership identification; 2) linear techniques perform better than non-linear techniques in density comparison; 3) UMAP (Uniform Manifold Approximation and Projection) and t-SNE (t-Distributed Stochastic Neighbor Embedding) perform best in cluster identification and membership identification; 4) NMF (Nonnegative Matrix Factorization) is competitive in distance comparison; 5) t-SNLE (t-Distributed Stochastic Neighbor Linear Embedding) is competitive in density comparison.

Similar articles

Revisiting Dimensionality Reduction Techniques for Visual Cluster Analysis: An Empirical Study.

IEEE Trans Vis Comput Graph. 2022 Jan;28(1):529-539. doi: 10.1109/TVCG.2021.3114694. Epub 2021 Dec 24.

Dimensionality reduction by UMAP reinforces sample heterogeneity analysis in bulk transcriptomic data.

Cell Rep. 2021 Jul 27;36(4):109442. doi: 10.1016/j.celrep.2021.109442.

Analyzing Single Cell RNA Sequencing with Topological Nonnegative Matrix Factorization.

J Comput Appl Math. 2024 Aug 1;445. doi: 10.1016/j.cam.2024.115842. Epub 2024 Feb 19.

UMAP as a Dimensionality Reduction Tool for Molecular Dynamics Simulations of Biomacromolecules: A Comparison Study.

J Phys Chem B. 2021 May 20;125(19):5022-5034. doi: 10.1021/acs.jpcb.1c02081. Epub 2021 May 11.

Evaluation of Distance Metrics and Spatial Autocorrelation in Uniform Manifold Approximation and Projection Applied to Mass Spectrometry Imaging Data.

Anal Chem. 2019 May 7;91(9):5706-5714. doi: 10.1021/acs.analchem.8b05827. Epub 2019 Apr 25.

Embedding Functional Brain Networks in Low Dimensional Spaces Using Manifold Learning Techniques.

Front Neuroinform. 2021 Dec 24;15:740143. doi: 10.3389/fninf.2021.740143. eCollection 2021.

Interactive Visual Cluster Analysis by Contrastive Dimensionality Reduction.

IEEE Trans Vis Comput Graph. 2023 Jan;29(1):734-744. doi: 10.1109/TVCG.2022.3209423. Epub 2022 Dec 16.

A cross entropy test allows quantitative statistical comparison of t-SNE and UMAP representations.

Cell Rep Methods. 2023 Jan 13;3(1):100390. doi: 10.1016/j.crmeth.2022.100390. eCollection 2023 Jan 23.

Comparison of dimension reduction techniques applied to the analysis of airborne radionuclide activity concentration.

J Environ Radioact. 2022 Apr;244-245:106813. doi: 10.1016/j.jenvrad.2022.106813. Epub 2022 Jan 29.

Shape-aware stochastic neighbor embedding for robust data visualisations.

BMC Bioinformatics. 2022 Nov 14;23(1):477. doi: 10.1186/s12859-022-05028-8.

Cited by

AI-Enhanced evaluation of YouTube content on post-surgical incontinence following pelvic cancer treatment.

SSM Popul Health. 2024 May 4;26:101677. doi: 10.1016/j.ssmph.2024.101677. eCollection 2024 Jun.

Cauchy hyper-graph Laplacian nonnegative matrix factorization for single-cell RNA-sequencing data analysis.

BMC Bioinformatics. 2024 Apr 29;25(1):169. doi: 10.1186/s12859-024-05797-4.

Prognostic Model and Immune Infiltration of Ferroptosis Subcluster-Related Modular Genes in Gastric Cancer.

J Oncol. 2022 Oct 13;2022:5813522. doi: 10.1155/2022/5813522. eCollection 2022.
