GLOBE: a contrastive learning-based framework for integrating single-cell transcriptome datasets.

Affiliation

School of Computer Science and Engineering, Central South University, Changsha, 410083, China.

Publication information

Brief Bioinform. 2022 Sep 20;23(5). doi: 10.1093/bib/bbac311.

Abstract

Integration of single-cell transcriptome datasets from multiple sources plays an important role in investigating complex biological systems. The key to integrating transcriptome datasets is batch effect removal. Recent methods attempt to apply a contrastive learning strategy to correct batch effects. Despite their encouraging performance, the optimal contrastive learning framework for batch effect removal is still under exploration. We develop an improved contrastive learning-based batch correction framework, GLOBE. GLOBE defines adaptive translation transformations for each cell to guarantee the stability of approximating batch effects. To enhance the consistency of representation alignment, GLOBE utilizes a loss function that is both hardness-aware and consistency-aware to learn batch effect-invariant representations. Moreover, GLOBE computes a batch-corrected gene matrix in a transparent manner to support diverse downstream analyses. Benchmarking results on a wide spectrum of datasets show that GLOBE outperforms other state-of-the-art methods in terms of robust batch mixing and superior conservation of biological signals. We further apply GLOBE to integrate two developing mouse neocortex datasets and show that GLOBE succeeds in removing batch effects while preserving the contiguous structure of cells in raw data. Finally, a comprehensive study is conducted to validate the effectiveness of GLOBE.
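
For readers unfamiliar with contrastive objectives, the following is a minimal sketch of a generic InfoNCE-style loss for a single anchor cell. It is not GLOBE's loss function, which the abstract does not specify. The function names (`cosine`, `info_nce`), the temperature value, and the toy two-dimensional embeddings are all assumptions made for illustration. The sketch shows only the general idea: the loss is small when the anchor's embedding is closer to its positive (for example, a matching cell from another batch) than to the negatives. In this kind of loss, negatives that are more similar to the anchor receive larger gradient weight, which is one common sense of "hardness-aware".

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def info_nce(anchor, positive, negatives, temperature=0.1):
    """Generic InfoNCE loss for one anchor (illustrative, not GLOBE's loss).

    Computes -log( exp(s_pos / t) / sum_k exp(s_k / t) ), where the sum runs
    over the positive and all negatives and s is cosine similarity.
    """
    logits = [cosine(anchor, positive) / temperature]
    logits += [cosine(anchor, n) / temperature for n in negatives]
    # Numerically stable log-sum-exp.
    m = max(logits)
    log_sum_exp = m + math.log(sum(math.exp(x - m) for x in logits))
    return log_sum_exp - logits[0]


# Toy embeddings (hypothetical values, chosen only to illustrate the behaviour).
anchor = [1.0, 0.0]
positive = [0.9, 0.1]
negatives = [[0.0, 1.0], [-1.0, 0.0]]

# Positive is close to the anchor: small loss.
loss_aligned = info_nce(anchor, positive, negatives)

# Swap roles so the "positive" is far from the anchor: large loss.
loss_misaligned = info_nce(anchor, [0.0, 1.0], [[0.9, 0.1], [-1.0, 0.0]])
```

Lowering `temperature` sharpens the softmax and concentrates the penalty on the most similar negatives; raising it spreads the penalty more evenly.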
