School of Biological Sciences, University of Auckland, Auckland, New Zealand.
Research School of Biology, Australian National University, Canberra, ACT, Australia.
BMC Bioinformatics. 2024 Sep 27;25(1):308. doi: 10.1186/s12859-024-05928-x.
The application of Uniform Manifold Approximation and Projection (UMAP) for dimensionality reduction and visualization has revolutionized the analysis of single-cell RNA expression and population genetics. However, its potential in single-cell DNA sequencing data analysis, particularly for visualizing gene mutation information, has not been fully explored.
We introduce Mugen-UMAP, a novel Python-based program that extends UMAP's utility to single-cell DNA sequencing data. The tool provides a pipeline that takes gene annotation files of single-cell somatic single-nucleotide variants, together with metadata, and produces UMAP projections for identifying clusters, along with various statistical analyses. Using Mugen-UMAP, we analyzed whole-exome sequencing (WES) data from 365 single-cell samples across 12 non-small cell lung cancer (NSCLC) patients, revealing distinct clusters associated with histological subtypes of NSCLC. To demonstrate the general utility of Mugen-UMAP, we also applied the program to 9 additional single-cell WES datasets from various cancer types, uncovering patterns of cell clusters that warrant further investigation. In summary, Mugen-UMAP provides a quick and effective visualization method for uncovering cell cluster patterns based on gene mutation information from single-cell DNA sequencing data.
The application of Mugen-UMAP demonstrates its capacity to provide valuable insights into the visualization and interpretation of single-cell DNA sequencing data. Mugen-UMAP can be found at https://github.com/tengchn/Mugen-UMAP.
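To make the idea of projecting gene mutation information concrete, the following is a minimal sketch and not Mugen-UMAP's actual code. It assumes the annotation files can be reduced to (cell, gene) pairs, one per somatic SNV, and that each cell is encoded as a binary vector marking which genes carry at least one mutation. The record layout, the function names `build_mutation_matrix` and `project`, and the choice of the Jaccard metric are all illustrative assumptions. Only `umap.UMAP` and its `fit_transform` method come from the umap-learn package.

```python
from collections import defaultdict

# Hypothetical annotation records: one (cell_id, gene) pair per somatic SNV.
# Real annotation files carry more columns; this layout is an assumption.
records = [
    ("cell_1", "TP53"),
    ("cell_1", "KRAS"),
    ("cell_2", "TP53"),
    ("cell_3", "EGFR"),
    ("cell_3", "KRAS"),
    ("cell_3", "KRAS"),  # a second SNV in the same gene still counts once
]


def build_mutation_matrix(records):
    """Return (cells, genes, matrix); matrix[i][j] is 1 if cell i has
    at least one SNV in gene j, else 0. Cells and genes are sorted by name."""
    mutated = defaultdict(set)
    for cell, gene in records:
        mutated[cell].add(gene)
    cells = sorted(mutated)
    genes = sorted({g for gene_set in mutated.values() for g in gene_set})
    matrix = [[1 if g in mutated[c] else 0 for g in genes] for c in cells]
    return cells, genes, matrix


def project(matrix, n_neighbors=15, seed=0):
    """Embed the binary matrix in 2D with umap-learn (assumed installed).

    The Jaccard metric is an illustrative choice for binary data, not
    necessarily what Mugen-UMAP uses. The dataset should contain more
    cells than n_neighbors.
    """
    import numpy as np
    import umap  # provided by the umap-learn package

    reducer = umap.UMAP(n_neighbors=n_neighbors, metric="jaccard",
                        random_state=seed)
    return reducer.fit_transform(np.asarray(matrix, dtype=np.float32))


cells, genes, matrix = build_mutation_matrix(records)
```

With a realistically sized dataset (hundreds of cells), `project(matrix)` would return one 2D coordinate per cell, which can then be plotted and colored by metadata such as patient or histological subtype.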