Multi-view data visualisation via manifold learning.

Author information

Rodosthenous Theodoulos, Shahrezaei Vahid, Evangelou Marina

Affiliation

Department of Mathematics, Imperial College London, London, United Kingdom.

Publication information

PeerJ Comput Sci. 2024 May 24;10:e1993. doi: 10.7717/peerj-cs.1993. eCollection 2024.

Abstract

Non-linear dimensionality reduction can be performed by manifold learning approaches, such as stochastic neighbour embedding (SNE), locally linear embedding (LLE) and isometric feature mapping (ISOMAP). These methods aim to produce two or three latent embeddings, primarily to visualise the data in intelligible representations. This manuscript proposes extensions of Student's t-distributed SNE (t-SNE), LLE and ISOMAP, for dimensionality reduction and visualisation of multi-view data. Multi-view data refers to multiple types of data generated from the same samples. The proposed multi-view approaches provide more comprehensible projections of the samples compared to the ones obtained by visualising each data-view separately. Commonly, visualisation is used for identifying underlying patterns within the samples. By incorporating the obtained low-dimensional embeddings from the multi-view manifold approaches into the K-means clustering algorithm, it is shown that clusters of the samples are accurately identified. Through extensive comparisons of novel and existing multi-view manifold learning algorithms on real and synthetic data, the proposed multi-view extension of t-SNE, named multi-SNE, is found to have the best performance, quantified both qualitatively and quantitatively by assessing the clusterings obtained. The applicability of multi-SNE is illustrated by its implementation in the newly developed and challenging multi-omics single-cell data. The aim is to visualise and identify cell heterogeneity and cell types in biological tissues relevant to health and disease. In this application, multi-SNE provides an improved performance over single-view manifold learning approaches and a promising solution for unified clustering of multi-omics single-cell data.
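The pipeline described in the abstract (embed multi-view data in a low-dimensional space, then pass the embedding to K-means and assess the clustering) can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' multi-SNE: it substitutes a naive baseline that standardises each view, concatenates them, and runs single-view t-SNE from scikit-learn. The synthetic data, the helper names `make_two_views` and `embed_and_cluster`, the perplexity of 30, and the choice of adjusted Rand index as the score are all hypothetical choices made for the example.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs
from sklearn.manifold import TSNE
from sklearn.metrics import adjusted_rand_score
from sklearn.preprocessing import StandardScaler


def make_two_views(n_samples=150, n_clusters=3, seed=0):
    """Hypothetical synthetic data: two views of the same samples."""
    # shuffle=False keeps samples ordered by cluster, so both views
    # describe the same samples with the same ground-truth labels.
    view_a, labels = make_blobs(n_samples=n_samples, n_features=10,
                                centers=n_clusters, shuffle=False,
                                random_state=seed)
    view_b, _ = make_blobs(n_samples=n_samples, n_features=25,
                           centers=n_clusters, shuffle=False,
                           random_state=seed + 1)
    return [view_a, view_b], labels


def embed_and_cluster(views, n_clusters, seed=0):
    """Naive baseline (NOT multi-SNE): concatenate views, t-SNE, K-means."""
    # Standardise each view so that no view dominates through scale alone.
    scaled = [StandardScaler().fit_transform(v) for v in views]
    stacked = np.hstack(scaled)
    # Two-dimensional embedding for visualisation.
    embedding = TSNE(n_components=2, perplexity=30, init="pca",
                     random_state=seed).fit_transform(stacked)
    # Cluster the samples in the embedded space.
    predicted = KMeans(n_clusters=n_clusters, n_init=10,
                       random_state=seed).fit_predict(embedding)
    return embedding, predicted


views, true_labels = make_two_views()
embedding, predicted = embed_and_cluster(views, n_clusters=3)
print("Adjusted Rand index:", adjusted_rand_score(true_labels, predicted))
```

The `embedding` array can be passed to any scatter-plot routine to inspect the projection visually, and running `embed_and_cluster` on a single-element list such as `[views[0]]` gives a single-view result to compare against.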

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/ae10/11157621/e6f3751ae1e1/peerj-cs-10-1993-g001.jpg
