Computer, Electrical and Mathematical Sciences and Engineering Division (CEMSE), Computational Bioscience Research Center (CBRC), King Abdullah University of Science and Technology (KAUST), Thuwal, 23955, Saudi Arabia.
Department of Computer Science and Engineering (CSE), The Chinese University of Hong Kong (CUHK), Hong Kong SAR, China.
Brief Bioinform. 2022 Sep 20;23(5). doi: 10.1093/bib/bbac377.
We present CLEAR, a novel self-supervised Contrastive LEArning framework for single-cell ribonucleic acid (RNA)-sequencing data representation and downstream analysis. Compared with current methods, CLEAR overcomes the heterogeneity of experimental data through a specifically designed representation learning task and can therefore handle batch effects and dropout events simultaneously. It achieves superior performance on a broad range of fundamental tasks, including clustering, visualization, dropout correction, batch effect removal, and pseudo-time inference. The proposed method successfully identifies and illustrates inflammation-related mechanisms in a COVID-19 study of 43 695 single peripheral blood mononuclear cells.
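The abstract does not spell out the representation learning task, so the following is only a minimal sketch of what a contrastive objective over augmented single-cell profiles can look like. Everything in it is an assumption made for illustration and is not taken from the paper: the InfoNCE loss, the `augment` function (random masking to mimic dropout plus Gaussian noise), the random linear projection standing in for a trained encoder, and all parameter values.

```python
import numpy as np

def augment(expr, rng, drop_prob=0.2, noise_sd=0.1):
    """Return a noisy view of a (cells x genes) expression matrix.

    Hypothetical augmentation: random masking mimics dropout events,
    Gaussian noise mimics technical variation.
    """
    keep = rng.random(expr.shape) >= drop_prob
    noise = rng.normal(0.0, noise_sd, size=expr.shape)
    return (expr + noise) * keep

def info_nce(z_a, z_b, temperature=0.2, eps=1e-12):
    """InfoNCE loss: row i of z_a and row i of z_b form the positive pair,
    all other rows of z_b act as negatives."""
    z_a = z_a / (np.linalg.norm(z_a, axis=1, keepdims=True) + eps)
    z_b = z_b / (np.linalg.norm(z_b, axis=1, keepdims=True) + eps)
    logits = z_a @ z_b.T / temperature            # cosine similarities
    logits = logits - logits.max(axis=1, keepdims=True)  # numerical stability
    log_prob = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return float(-np.mean(np.diag(log_prob)))

rng = np.random.default_rng(0)

# Simulated counts for 8 cells and 50 genes (not real data).
counts = rng.poisson(2.0, size=(8, 50)).astype(float)
log_expr = np.log1p(counts)

# Stand-in encoder: a fixed random linear projection to 16 dimensions.
weights = rng.normal(size=(50, 16))

# Two independently augmented views of the same cells.
view_a = augment(log_expr, rng) @ weights
view_b = augment(log_expr, rng) @ weights

loss = info_nce(view_a, view_b)
print(f"contrastive loss on simulated data: {loss:.3f}")
```

In a sketch like this, minimizing the loss would pull the two views of the same cell together and push different cells apart, which is the general intuition behind learning representations that tolerate dropout-like noise. The actual CLEAR augmentations, encoder, and loss should be taken from the paper itself.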