Projected t-SNE for batch correction.

Affiliations

Department of Statistical Sciences, University of Padova, Padova 35121, Italy.

RENCI, University of North Carolina, Chapel Hill, NC 27517, USA.

Publication information

Bioinformatics. 2020 Jun 1;36(11):3522-3527. doi: 10.1093/bioinformatics/btaa189.

DOI:10.1093/bioinformatics/btaa189
PMID:32176244
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC7267829/
Abstract

MOTIVATION

Low-dimensional representations of high-dimensional data are routinely employed in biomedical research to visualize, interpret and communicate results from different pipelines. In this article, we propose a novel procedure to directly estimate t-SNE embeddings that are not driven by batch effects. Without correction, interesting structure in the data can be obscured by batch effects. The proposed algorithm can therefore significantly aid visualization of high-dimensional data.

RESULTS

The proposed methods are based on linear algebra and constrained optimization, leading to efficient algorithms and fast computation in many high-dimensional settings. Results on artificial single-cell transcription profiling data show that the proposed procedure successfully removes multiple batch effects from t-SNE embeddings, while retaining fundamental information on cell types. When applied to single-cell gene expression data to investigate mouse medulloblastoma, the proposed method successfully removes batch effects associated with mouse identifiers and the date of the experiment, while preserving clusters of oligodendrocytes, astrocytes, endothelial cells and microglia, which are expected to lie in the stroma within or adjacent to the tumours.
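
The combination of linear algebra and constrained optimization mentioned above can be illustrated with a small sketch. The Python snippet below is not the authors' R implementation and does not compute a t-SNE embedding. It assumes a hypothetical two-dimensional embedding matrix `Y` and a one-hot batch design matrix `Z`, and shows one common way to impose such a constraint: an orthogonal projection removes the part of `Y` that is linearly explained by `Z`, and the same projection can follow each update of a generic projected gradient descent. The function names `project_out_batch` and `projected_gradient_step` are illustrative assumptions.

```python
import numpy as np


def project_out_batch(Y, Z):
    """Remove from Y the component lying in the column space of Z.

    Y: (n, d) array of embedding coordinates (hypothetical input).
    Z: (n, k) batch design matrix, e.g. one-hot batch indicators.
    The result Y_out satisfies Z.T @ Y_out = 0 up to rounding error.
    """
    Y = np.asarray(Y, dtype=float)
    Z = np.asarray(Z, dtype=float)
    # Least-squares fit of Y on Z; lstsq also handles rank-deficient Z.
    beta, *_ = np.linalg.lstsq(Z, Y, rcond=None)
    # Subtract the fitted values, i.e. the projection of Y onto col(Z).
    return Y - Z @ beta


def projected_gradient_step(Y, grad, Z, lr=0.1):
    """One generic projected gradient step (illustrative, not the paper's).

    Takes an ordinary gradient step, then projects the result back onto
    the set of embeddings that are orthogonal to the batch design Z.
    """
    return project_out_batch(Y - lr * grad, Z)


# Toy data: two batches, the second shifted by a constant offset.
rng = np.random.default_rng(0)
n = 200
batch = rng.integers(0, 2, size=n)
Z = np.column_stack([batch == 0, batch == 1]).astype(float)
Y = rng.normal(size=(n, 2)) + 3.0 * batch[:, None]

Y_proj = project_out_batch(Y, Z)

# Largest remaining linear association between batches and coordinates.
residual = np.abs(Z.T @ Y_proj).max()
print("max |Z^T Y| after projection:", residual)
```

With a one-hot `Z`, the projection amounts to centring each batch at the origin in every embedding dimension, which is why the batch offset disappears. In a real analysis the gradient passed to `projected_gradient_step` would come from the embedding objective, which this sketch leaves out.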

AVAILABILITY AND IMPLEMENTATION

Source code implementing the proposed approach is available as an R package at https://github.com/emanuelealiverti/BC_tSNE, including a tutorial to reproduce the simulation studies.

CONTACT

aliverti@stat.unipd.it.
