Tianjin Cancer Institute, National Clinical Research Center for Cancer, Key Laboratory of Cancer Prevention and Therapy, Tianjin's Clinical Research Center for Cancer, Tianjin Medical University Cancer Institute and Hospital, Tianjin Medical University, Tianjin, China.
State Key Laboratory of Experimental Hematology, The Province and Ministry Co-sponsored Collaborative Innovation Center for Medical Epigenetics, Key Laboratory of Breast Cancer Prevention and Therapy (Ministry of Education), Key Laboratory of Immune Microenvironment and Disease (Ministry of Education), Department of Biochemistry and Molecular Biology, School of Basic Medical Sciences, Tianjin Medical University Cancer Institute and Hospital, Tianjin Medical University, Tianjin 300070, China.
Brief Bioinform. 2022 Sep 20;23(5). doi: 10.1093/bib/bbac327.
Integrating the accumulating body of large-scale single-cell transcriptomes requires scalable batch-correction approaches. Here we propose Fugue, a simple and efficient batch-correction method that scales to the integration of very large single-cell transcriptome collections from diverse sources. The core idea of the method is to encode batch information as trainable parameters and add these parameters to each single-cell expression profile; a contrastive learning approach is then used to learn a feature representation of the resulting additive expression profile. We demonstrate the scalability of Fugue by integrating all single cells obtained from the Human Cell Atlas. We benchmark Fugue against current state-of-the-art methods and show that it consistently achieves improved performance in terms of data alignment and clustering preservation. Our study will facilitate the integration of single-cell transcriptomes at increasingly large scale.
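The core idea described above can be illustrated with a minimal NumPy sketch of a single forward pass. This is not the Fugue implementation: the toy data, the linear encoder, the dropout-style augmentation, the InfoNCE-style loss, and all names (`batch_embedding`, `encode`, `augment`, `info_nce`) are assumptions made for illustration, since the abstract does not specify the encoder or the contrastive objective.

```python
import numpy as np

rng = np.random.default_rng(0)
n_cells, n_genes, n_batches, dim = 8, 20, 3, 5

# Toy data (hypothetical): log-normalized expression and a batch label per cell.
X = np.log1p(rng.poisson(2.0, size=(n_cells, n_genes)).astype(float))
batch_ids = rng.integers(0, n_batches, size=n_cells)

# Parameters that would be trained: one vector per batch (same width as the
# expression profile) and, as an assumption, a single linear encoder.
batch_embedding = rng.normal(0.0, 0.01, size=(n_batches, n_genes))
W = rng.normal(0.0, 0.1, size=(n_genes, dim))

def augment(x, drop_prob=0.2):
    """Randomly zero out genes to create one 'view' of each cell (assumed augmentation)."""
    return x * (rng.random(x.shape) > drop_prob)

def encode(x, ids):
    """Add the batch vector to each profile, project, and L2-normalize."""
    h = (x + batch_embedding[ids]) @ W
    return h / (np.linalg.norm(h, axis=1, keepdims=True) + 1e-12)

def info_nce(z1, z2, temperature=0.1):
    """InfoNCE-style loss: two views of the same cell are the positive pair."""
    logits = z1 @ z2.T / temperature
    logits = logits - logits.max(axis=1, keepdims=True)  # numerical stability
    log_prob = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return -np.mean(np.diag(log_prob))

z1 = encode(augment(X), batch_ids)
z2 = encode(augment(X), batch_ids)
loss = info_nce(z1, z2)
```

In an actual training loop, gradients of `loss` would update both `W` and `batch_embedding`, so the per-batch vectors are learned jointly with the representation; an automatic-differentiation framework would be needed for that step, which this sketch omits.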