Robust integration of multiple single-cell RNA sequencing datasets using a single reference space.

Affiliations

Department of Genetics, Albert Einstein College of Medicine, Bronx, NY, USA.

Department of Epidemiology and Population Health, Albert Einstein College of Medicine, Bronx, NY, USA.

Publication information

Nat Biotechnol. 2021 Jul;39(7):877-884. doi: 10.1038/s41587-021-00859-x. Epub 2021 Mar 25.

Abstract

In many biological applications of single-cell RNA sequencing (scRNA-seq), an integrated analysis of data from multiple batches or studies is necessary. Current methods typically achieve integration using shared cell types or covariance correlation between datasets, which can distort biological signals. Here we introduce an algorithm that uses the gene eigenvectors from a reference dataset to establish a global frame for integration. Using simulated and real datasets, we demonstrate that this approach, called Reference Principal Component Integration (RPCI), consistently outperforms other methods by multiple metrics, with clear advantages in preserving genuine cross-sample gene expression differences in matching cell types, such as those present in cells at distinct developmental stages or in perturbed versus control studies. Moreover, RPCI maintains this robust performance when multiple datasets are integrated. Finally, we applied RPCI to scRNA-seq data for mouse gut endoderm development and revealed the temporal emergence of genetic programs that help establish the anterior-posterior axis in visceral endoderm.
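The core idea in the abstract, using gene eigenvectors from one reference dataset as a shared frame, can be illustrated with a minimal NumPy sketch. This is not the authors' RPCI implementation. The function name `reference_pc_projection`, the choice to center every dataset on the reference gene means, and the absence of normalization or scaling steps are all assumptions made for illustration.

```python
import numpy as np


def reference_pc_projection(reference, others, n_components=2):
    """Project datasets onto the gene eigenvectors of a reference dataset.

    reference: (cells x genes) array.
    others: list of (cells x genes) arrays with the same gene order.
    Hypothetical helper: illustrates the general idea only.
    """
    # Assumption: all datasets are centered on the reference gene means.
    ref_mean = reference.mean(axis=0)

    # The right singular vectors of the centered reference matrix are the
    # eigenvectors of its gene-gene covariance matrix.
    _, _, vt = np.linalg.svd(reference - ref_mean, full_matrices=False)
    gene_eigvecs = vt[:n_components].T  # shape: genes x components

    # Every dataset, including the reference, is projected into the same
    # low-dimensional space defined by the reference alone.
    embeddings = [(x - ref_mean) @ gene_eigvecs for x in [reference, *others]]
    return embeddings, gene_eigvecs


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    ref = rng.normal(size=(50, 10))    # synthetic reference: 50 cells, 10 genes
    other = rng.normal(size=(30, 10))  # synthetic second dataset: 30 cells
    embeddings, vecs = reference_pc_projection(ref, [other], n_components=3)
    print([e.shape for e in embeddings])
```

Because the eigenvectors come from a single dataset, the coordinates of all cells are directly comparable, which is the property the abstract describes as a global frame. Everything downstream of that projection in the published method is outside the scope of this sketch.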
