Duan Bin, Zhu Chenyu, Chuai Guohui, Tang Chen, Chen Xiaohan, Chen Shaoqi, Fu Shaliu, Li Gaoyang, Liu Qi
Translational Medical Center for Stem Cell Therapy and Institute for Regenerative Medicine, Shanghai East Hospital, Bioinformatics Department, School of Life Sciences and Technology, Tongji University, Shanghai 200092, China.
Sci Adv. 2020 Oct 30;6(44). doi: 10.1126/sciadv.abd0855. Print 2020 Oct.
Efficient single-cell assignment without prior marker gene annotations is essential for single-cell sequencing data analysis. Current methods, however, have limited effectiveness across distinct single-cell assignment tasks. They fail to generalize well across tasks because of the inherent heterogeneity of single-cell sequencing datasets and of single-cell types. Furthermore, current methods are inefficient at identifying novel cell types that are absent from the reference datasets. To this end, we present scLearn, a learning-based framework that automatically infers the quantitative similarity measurement and the threshold to be used for different single-cell assignment tasks, achieving well-generalized assignment performance across different single-cell types. We evaluated scLearn on a comprehensive set of publicly available benchmark datasets. We showed that scLearn outperformed comparable existing methods for single-cell assignment in various respects, demonstrating state-of-the-art effectiveness with reliable and generalizable single-cell type identification and categorization.
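The idea of assigning a query cell by similarity to a reference, and rejecting it as a possible novel type when the similarity falls below a threshold, can be illustrated with a minimal sketch. This is not scLearn's implementation: the abstract states that scLearn learns both the similarity measurement and the threshold from data, whereas the sketch below assumes a fixed cosine similarity to per-type mean profiles and a hand-picked threshold. All function names, the toy expression values, and the threshold of 0.9 are hypothetical.

```python
import math
from collections import defaultdict


def centroids_by_type(profiles, labels):
    """Average the reference expression profiles of each cell type."""
    groups = defaultdict(list)
    for profile, label in zip(profiles, labels):
        groups[label].append(profile)
    return {
        label: [sum(col) / len(rows) for col in zip(*rows)]
        for label, rows in groups.items()
    }


def cosine(a, b):
    """Cosine similarity between two expression vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def assign_cells(queries, centroids, threshold):
    """Label each query cell with its most similar reference type,
    or "unassigned" when the best similarity is below the threshold."""
    assignments = []
    for query in queries:
        label, sim = max(
            ((lab, cosine(query, c)) for lab, c in centroids.items()),
            key=lambda pair: pair[1],
        )
        assignments.append(label if sim >= threshold else "unassigned")
    return assignments


# Toy reference: three genes, two annotated cell types (made-up values).
reference = [
    [10.0, 0.0, 1.0],
    [9.0, 1.0, 0.0],
    [0.0, 8.0, 1.0],
    [1.0, 9.0, 0.0],
]
reference_labels = ["T cell", "T cell", "B cell", "B cell"]

# Toy queries: the third cell resembles neither reference type.
queries = [
    [8.0, 0.5, 0.5],
    [0.5, 7.0, 0.5],
    [0.0, 0.0, 5.0],
]

centroids = centroids_by_type(reference, reference_labels)
result = assign_cells(queries, centroids, threshold=0.9)
print(result)
```

In this toy setup the first two query cells are labelled with their nearest reference type, and the third is left unassigned because its best similarity is far below the threshold. A fixed measure and threshold like these are exactly what the abstract argues generalize poorly across datasets, which is the motivation for learning them instead.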