Robust integration of multiple single-cell RNA sequencing datasets using a single reference space.

Affiliations

Department of Genetics, Albert Einstein College of Medicine, Bronx, NY, USA.

Department of Epidemiology and Population Health, Albert Einstein College of Medicine, Bronx, NY, USA.

Publication information

Nat Biotechnol. 2021 Jul;39(7):877-884. doi: 10.1038/s41587-021-00859-x. Epub 2021 Mar 25.

DOI:10.1038/s41587-021-00859-x
PMID:33767393
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC8456427/
Abstract

In many biological applications of single-cell RNA sequencing (scRNA-seq), an integrated analysis of data from multiple batches or studies is necessary. Current methods typically achieve integration using shared cell types or covariance correlation between datasets, which can distort biological signals. Here we introduce an algorithm that uses the gene eigenvectors from a reference dataset to establish a global frame for integration. Using simulated and real datasets, we demonstrate that this approach, called Reference Principal Component Integration (RPCI), consistently outperforms other methods by multiple metrics, with clear advantages in preserving genuine cross-sample gene expression differences in matching cell types, such as those present in cells at distinct developmental stages or in perturbated versus control studies. Moreover, RPCI maintains this robust performance when multiple datasets are integrated. Finally, we applied RPCI to scRNA-seq data for mouse gut endoderm development and revealed temporal emergence of genetic programs helping establish the anterior-posterior axis in visceral endoderm.
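
The core idea in the abstract, using gene eigenvectors from one reference dataset as a shared frame, can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the function names `reference_pcs` and `project`, the simulated data, and the choice to center every dataset on the reference gene means are assumptions made only for illustration.

```python
import numpy as np


def reference_pcs(ref, n_pcs):
    """Return gene means and the top gene eigenvectors of a cells x genes matrix."""
    mean = ref.mean(axis=0)
    # SVD of the centered matrix: rows of vt are eigenvectors in gene space.
    _, _, vt = np.linalg.svd(ref - mean, full_matrices=False)
    return mean, vt[:n_pcs].T  # loadings have shape (genes, n_pcs)


def project(data, mean, loadings):
    """Project a cells x genes matrix onto the reference eigenvectors."""
    # Assumption: every dataset is centered on the reference means,
    # so all cells are expressed in the same coordinate frame.
    return (data - mean) @ loadings


# Hypothetical toy data: three latent gene programs shared by both datasets.
rng = np.random.default_rng(0)
n_genes = 50
programs = rng.normal(size=(3, n_genes))


def simulate(n_cells, shift):
    """Simulate expression as program activity plus a dataset-wide shift and noise."""
    activity = rng.normal(size=(n_cells, 3))
    noise = 0.1 * rng.normal(size=(n_cells, n_genes))
    return activity @ programs + shift + noise


reference = simulate(200, shift=0.0)
query = simulate(150, shift=0.5)

mean, loadings = reference_pcs(reference, n_pcs=3)
ref_embed = project(reference, mean, loadings)
query_embed = project(query, mean, loadings)
```

Because both embeddings come from the same loadings, the reference and query cells share one set of axes; the eigenvectors are never re-estimated on the query, which is the property the abstract attributes to a single reference space.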

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/9532/8456427/889eacf066f4/nihms-1739743-f0001.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/9532/8456427/65926261303e/nihms-1739743-f0002.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/9532/8456427/ec6a7d070118/nihms-1739743-f0003.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/9532/8456427/242a8703e329/nihms-1739743-f0004.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/9532/8456427/35968d6d25dc/nihms-1739743-f0005.jpg