Dimensionality reduction by UMAP reinforces sample heterogeneity analysis in bulk transcriptomic data.

Affiliations

The University of Queensland Diamantina Institute, Faculty of Medicine, The University of Queensland, Translational Research Institute, Brisbane, QLD, Australia; Shandong Artificial Intelligence Institute, Qilu University of Technology (Shandong Academy of Sciences), Jinan, China.

Shandong Artificial Intelligence Institute, Qilu University of Technology (Shandong Academy of Sciences), Jinan, China; School of Microelectronics, Shandong University, Jinan, China.

Publication information

Cell Rep. 2021 Jul 27;36(4):109442. doi: 10.1016/j.celrep.2021.109442.

DOI: 10.1016/j.celrep.2021.109442
PMID: 34320340
Abstract

Transcriptomic analysis plays a key role in biomedical research. Linear dimensionality reduction methods, especially principal-component analysis (PCA), are widely used in detecting sample-to-sample heterogeneity, while recently developed non-linear methods, such as t-distributed stochastic neighbor embedding (t-SNE) and uniform manifold approximation and projection (UMAP), can efficiently cluster heterogeneous samples in single-cell RNA sequencing analysis. Yet, the application of t-SNE and UMAP in bulk transcriptomic analysis and their comparison with conventional methods have not been reported. We compare four major dimensionality reduction methods (PCA, multidimensional scaling [MDS], t-SNE, and UMAP) in analyzing 71 large bulk transcriptomic datasets. UMAP is superior to PCA and MDS and shows some advantages over t-SNE in differentiating batch effects, identifying pre-defined biological groups, and revealing in-depth clusters in two-dimensional space. Importantly, UMAP generates sample clusters uncovering biological features and clinical meaning. We recommend deploying UMAP in visualizing and analyzing sizable bulk transcriptomic datasets to reinforce sample heterogeneity analysis.
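The comparison described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' pipeline. It assumes a synthetic two-group expression matrix and uses a hypothetical score (`nn_label_agreement`, the fraction of samples whose nearest neighbor in the 2D embedding shares their group label) as a stand-in for "identifying pre-defined biological groups". PCA is computed directly with NumPy. MDS, t-SNE, and UMAP are added only if scikit-learn and umap-learn happen to be installed. All function names and parameter values here are illustrative assumptions.

```python
import numpy as np


def simulate_bulk_expression(n_per_group=30, n_genes=500, seed=0):
    """Synthetic log-expression matrix (samples x genes) with two groups.

    Purely illustrative data, not taken from the paper.
    """
    rng = np.random.default_rng(seed)
    base = rng.normal(5.0, 1.0, size=n_genes)
    shift = np.zeros(n_genes)
    shift[:50] = 2.0  # first 50 genes differ between the two groups
    group0 = base + rng.normal(0.0, 0.5, size=(n_per_group, n_genes))
    group1 = base + shift + rng.normal(0.0, 0.5, size=(n_per_group, n_genes))
    X = np.vstack([group0, group1])
    labels = np.array([0] * n_per_group + [1] * n_per_group)
    return X, labels


def pca_2d(X):
    """Project samples onto the first two principal components via SVD."""
    Xc = X - X.mean(axis=0)  # center each gene
    U, S, _ = np.linalg.svd(Xc, full_matrices=False)
    return U[:, :2] * S[:2]


def nn_label_agreement(emb, labels):
    """Hypothetical score: share of samples whose nearest neighbor in the
    embedding belongs to the same pre-defined group."""
    d = np.linalg.norm(emb[:, None, :] - emb[None, :, :], axis=-1)
    np.fill_diagonal(d, np.inf)  # a sample is not its own neighbor
    nearest = d.argmin(axis=1)
    return float((labels[nearest] == labels).mean())


def embed_all(X, seed=0):
    """Return 2D embeddings keyed by method name.

    MDS, t-SNE, and UMAP are optional and skipped if their packages
    (scikit-learn, umap-learn) are not installed.
    """
    embeddings = {"PCA": pca_2d(X)}
    try:
        from sklearn.manifold import MDS, TSNE
        embeddings["MDS"] = MDS(n_components=2, random_state=seed).fit_transform(X)
        embeddings["t-SNE"] = TSNE(
            n_components=2, perplexity=15, random_state=seed
        ).fit_transform(X)
    except ImportError:
        pass
    try:
        import umap
        embeddings["UMAP"] = umap.UMAP(
            n_components=2, n_neighbors=15, random_state=seed
        ).fit_transform(X)
    except ImportError:
        pass
    return embeddings
```

A typical use would be to call `embed_all(X)` on a samples-by-genes matrix and then print `nn_label_agreement(emb, labels)` for each returned embedding. On real bulk datasets the ranking of methods depends on sample size, preprocessing, and hyperparameters such as `perplexity` and `n_neighbors`, so a toy result like this should not be read as reproducing the paper's findings.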

