Center for Applied Statistics and School of Statistics, Renmin University of China, Beijing, China.
RSS and China-Re Life Joint Lab on Public Health and Risk Management, Renmin University of China, Beijing, China.
Genome Biol. 2023 Sep 11;24(1):208. doi: 10.1186/s13059-023-03046-0.
Clustering is a critical component of single-cell RNA sequencing (scRNA-seq) data analysis and can help reveal cell types and infer cell lineages. Despite considerable progress in clustering, few methods are tailored to identifying the cluster-specific genes that contribute to cell heterogeneity, even though identifying such genes can deepen biological understanding of that heterogeneity. In this study, we propose a zero-inflated negative binomial mixture model (ZINBMM) that simultaneously achieves effective scRNA-seq data clustering and gene selection. ZINBMM conducts a systematic analysis of raw counts, accommodating both batch effects and dropout events. Simulations and analyses of five scRNA-seq datasets demonstrate the practical applicability of ZINBMM.
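To make the modeling idea concrete, the following is a minimal sketch of a textbook zero-inflated negative binomial (ZINB) mixture, not the ZINBMM implementation itself. It assumes a mean/dispersion parametrization of the negative binomial, a single dropout probability `pi` and dispersion `theta` shared across genes, and independence between genes; the function names are hypothetical, and the batch-effect terms and gene-selection step described in the abstract are deliberately omitted.

```python
import math


def nb_log_pmf(x, mu, theta):
    """Log-pmf of a negative binomial with mean mu > 0 and dispersion theta > 0
    (variance mu + mu**2 / theta). Standard parametrization, assumed here."""
    return (math.lgamma(x + theta) - math.lgamma(theta) - math.lgamma(x + 1)
            + theta * math.log(theta / (theta + mu))
            + x * math.log(mu / (theta + mu)))


def zinb_pmf(x, mu, theta, pi):
    """ZINB pmf: with probability pi the count is a structural zero (dropout),
    otherwise it is drawn from the negative binomial."""
    nb = math.exp(nb_log_pmf(x, mu, theta))
    if x == 0:
        return pi + (1.0 - pi) * nb
    return (1.0 - pi) * nb


def cluster_responsibilities(counts, weights, mus, theta, pi):
    """Posterior probability of each cluster for one cell's raw counts.

    counts  : list of raw counts, one per gene
    weights : mixing proportions, one per cluster
    mus     : per-cluster lists of gene means (same length as counts)
    """
    log_post = []
    for w, mu_k in zip(weights, mus):
        log_lik = sum(math.log(zinb_pmf(x, m, theta, pi))
                      for x, m in zip(counts, mu_k))
        log_post.append(math.log(w) + log_lik)
    # Normalize in log space for numerical stability.
    top = max(log_post)
    unnorm = [math.exp(v - top) for v in log_post]
    total = sum(unnorm)
    return [u / total for u in unnorm]


# Illustrative values only (not taken from the paper).
cell = [20, 0, 15]
resp = cluster_responsibilities(
    cell,
    weights=[0.5, 0.5],
    mus=[[20.0, 1.0, 15.0], [1.0, 20.0, 1.0]],
    theta=2.0,
    pi=0.3,
)
```

In a full mixture-model fit, responsibilities like `resp` would form the E-step of an EM algorithm, with the cluster means, dispersion, and dropout probability re-estimated in the M-step. Genes whose estimated means do not differ across clusters are the natural candidates for exclusion in a gene-selection scheme.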