Institute for Molecular Bioscience, The University of Queensland, Brisbane, QLD, Australia.
School of Biomedical Sciences, Li Ka Shing Faculty of Medicine, The University of Hong Kong, Pokfulam, Hong Kong SAR, China.
Nucleic Acids Res. 2023 Jun 23;51(11):e62. doi: 10.1093/nar/gkad307.
Methods for cell clustering and gene expression analysis of single-cell RNA sequencing (scRNA-seq) data are essential for the biological interpretation of cell processes. Here, we present TRIAGE-Cluster, which uses genome-wide epigenetic data from diverse bio-samples to identify genes demarcating cell diversity in scRNA-seq data. By integrating patterns of repressive chromatin deposited across diverse cell types with weighted density estimation, TRIAGE-Cluster determines cell type clusters in a 2D UMAP space. We then present TRIAGE-ParseR, a machine learning method that evaluates gene expression rank lists to define gene groups governing the identity and function of cell types. We demonstrate the utility of this two-step approach using atlases of in vivo and in vitro cell diversification and organogenesis. We also provide a web-accessible dashboard for analysis and download of data and software. Collectively, genome-wide epigenetic repression provides a versatile strategy to define cell diversity and study gene regulation in scRNA-seq data.
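The abstract names weighted density estimation in a 2D UMAP space but does not give the estimator. The following is a minimal sketch of the general idea, not the TRIAGE-Cluster implementation. It assumes a Gaussian kernel, a fixed bandwidth, and one non-negative weight per cell (for example, a hypothetical score derived from repressive-chromatin-prioritised genes). The function names `weighted_kde` and `grid_peaks`, the toy coordinates, and the weights are all illustrative assumptions.

```python
import math
import random


def weighted_kde(points, weights, bandwidth):
    """Return a function giving the weighted Gaussian density at (x, y).

    points: list of (x, y) embedding coordinates, e.g. from UMAP.
    weights: one non-negative weight per point (assumed per-cell score).
    bandwidth: kernel standard deviation in embedding units.
    """
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    # Normalisation of an isotropic 2D Gaussian, divided by total weight.
    norm = 1.0 / (2.0 * math.pi * bandwidth ** 2 * total)

    def density(x, y):
        acc = 0.0
        for (px, py), w in zip(points, weights):
            d2 = (x - px) ** 2 + (y - py) ** 2
            acc += w * math.exp(-d2 / (2.0 * bandwidth ** 2))
        return norm * acc

    return density


def grid_peaks(density, xs, ys, min_fraction=0.5):
    """Return interior grid points that are strict local maxima
    (8-neighbourhood) and reach at least min_fraction of the grid maximum."""
    grid = [[density(x, y) for x in xs] for y in ys]
    top = max(max(row) for row in grid)
    peaks = []
    for i in range(1, len(ys) - 1):
        for j in range(1, len(xs) - 1):
            value = grid[i][j]
            if value < min_fraction * top:
                continue
            neighbours = [
                grid[i + di][j + dj]
                for di in (-1, 0, 1)
                for dj in (-1, 0, 1)
                if (di, dj) != (0, 0)
            ]
            if all(value > n for n in neighbours):
                peaks.append((xs[j], ys[i]))
    return peaks


# Toy embedding: two equally sized groups of cells (synthetic, not real data).
rng = random.Random(0)
group_a = [(rng.gauss(-3.0, 0.5), rng.gauss(0.0, 0.5)) for _ in range(100)]
group_b = [(rng.gauss(3.0, 0.5), rng.gauss(0.0, 0.5)) for _ in range(100)]
points = group_a + group_b
# Assumed weights: group A cells score high, group B cells score low.
weights = [1.0] * len(group_a) + [0.1] * len(group_b)

density = weighted_kde(points, weights, bandwidth=0.5)
xs = [-6.0 + 0.25 * k for k in range(49)]
ys = [-3.0 + 0.25 * k for k in range(25)]
peaks = grid_peaks(density, xs, ys, min_fraction=0.5)
```

In this toy example both groups contain the same number of cells, so an unweighted estimate would treat them alike. With the weights applied, only the high-weight group passes the peak threshold, which illustrates how per-cell weights can steer which regions of the embedding are called as clusters.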