Liu Xiao, Zhang Ting, Tan Ziyang, Warden Antony R, Li Shanhe, Cheung Edwin, Ding Xianting
Institute of Personalized Medicine, School of Biomedical Engineering, Shanghai Jiao Tong University, Shanghai, 200030 China.
Cancer Centre, Faculty of Health Sciences, University of Macau, Taipa, 999078 China.
Phenomics. 2022 May 19;2(5):323-335. doi: 10.1007/s43657-022-00056-z. eCollection 2022 Oct.
Although many methods have been developed to explore cell function by clustering high-dimensional (HD) single-cell omics data, protein or gene biomarkers whose expression differs only slightly across cells hamper cluster delineation and downstream analysis. Here, we introduce a hashing-based framework that improves the delineation of cell clusters. It rests on the hypothesis that a variable showing no significant difference can be decomposed into more diverse latent variables that distinguish cells. By projecting the original data into a sparse HD space, fly and densefly hashing preprocessing retains the local structure of the data and improves the cluster delineation of existing clustering methods, such as PhenoGraph. Moreover, analyses of a mass cytometry dataset show that our hashing-based framework unveils previously hidden heterogeneity within cell clusters. The proposed framework promotes the use of cell biomarkers and enriches biological findings by introducing more latent variables.
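The projection step described above can be illustrated with a minimal sketch of fly-style hashing: each cell's marker vector is centered, expanded into a much larger space through a sparse binary random matrix, and then sparsified by keeping only the strongest activations (winner-take-all). This is not the authors' implementation. The function name `fly_hash` and the parameters `expansion`, `sample_frac`, and `top_frac` are hypothetical, their default values are assumptions chosen for illustration, and the densefly variant is not shown.

```python
import numpy as np


def fly_hash(X, expansion=20, sample_frac=0.1, top_frac=0.05, seed=0):
    """Illustrative fly-style hashing of a cells-by-markers matrix X.

    All parameter names and defaults are assumptions for this sketch,
    not values taken from the paper.
    """
    rng = np.random.default_rng(seed)
    n_cells, n_markers = X.shape
    n_out = expansion * n_markers  # size of the expanded (HD) space

    # Center each cell's marker vector so projections reflect relative expression.
    Xc = X - X.mean(axis=1, keepdims=True)

    # Sparse binary projection: each output unit sums a small random subset of markers.
    n_in = max(1, int(round(sample_frac * n_markers)))
    M = np.zeros((n_markers, n_out))
    for j in range(n_out):
        idx = rng.choice(n_markers, size=n_in, replace=False)
        M[idx, j] = 1.0

    activations = Xc @ M

    # Winner-take-all: keep the k largest activations per cell, zero out the rest.
    k = max(1, int(round(top_frac * n_out)))
    top = np.argpartition(activations, -k, axis=1)[:, -k:]
    rows = np.arange(n_cells)[:, None]
    H = np.zeros_like(activations)
    H[rows, top] = activations[rows, top]
    return H


if __name__ == "__main__":
    # Synthetic stand-in for a cells-by-markers expression matrix.
    X = np.random.default_rng(1).normal(size=(50, 10))
    H = fly_hash(X)
    print(H.shape)  # expanded, sparse representation of each cell
```

Under these assumptions, the hashed matrix `H` would replace the original expression matrix as the input to an existing clustering method such as PhenoGraph.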
The online version contains supplementary material available at 10.1007/s43657-022-00056-z.