School of Computer Science and Engineering, Central South University, Changsha, China.
Xiangjiang Laboratory, Changsha, China.
Nat Commun. 2024 Aug 31;15(1):7561. doi: 10.1038/s41467-024-51891-9.
Single-cell RNA sequencing (scRNA-seq) technologies have become essential tools for characterizing cellular landscapes within complex tissues. Large-scale single-cell transcriptomics holds great potential for identifying rare cell types critical to the pathogenesis of diseases and biological processes. Existing methods for identifying rare cell types often rely on one-time clustering using partial or global gene expression. However, these rare cell types may be overlooked during the clustering phase, posing challenges for their accurate identification. In this paper, we propose a Cluster decomposition-based Anomaly Detection method (scCAD), which iteratively decomposes clusters based on the most differential signals in each cluster to effectively separate rare cell types and achieve accurate identification. We benchmark scCAD on 25 real-world scRNA-seq datasets, demonstrating its superior performance compared to 10 state-of-the-art methods. In-depth case studies across diverse datasets, including mouse airway, brain, intestine, human pancreas, immunology data, and clear cell renal cell carcinoma, showcase scCAD's efficiency in identifying rare cell types in complex biological scenarios. Furthermore, scCAD can correct the annotation of rare cell types and identify immune cell subtypes associated with disease, thereby offering valuable insights into disease progression.
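As a rough illustration of the iterative decomposition idea described above, the following Python sketch repeatedly splits each cluster along its most variable gene and then flags unusually small clusters as rare-cell candidates. It is not the scCAD implementation. The splitting rule (highest-variance gene, largest gap in sorted expression), the function names `split_once`, `decompose`, and `rare_cells`, the thresholds `min_gap` and `max_fraction`, and the synthetic data are all assumptions made for illustration.

```python
import numpy as np


def split_once(sub, min_gap):
    """Try to split one cluster along its most variable gene.

    Returns a boolean mask marking the upper part, or None if no clear gap exists.
    Assumption: variance stands in for "most differential signal".
    """
    gene = int(np.argmax(sub.var(axis=0)))   # most variable gene inside this cluster
    values = sub[:, gene]
    ordered = np.sort(values)
    gaps = np.diff(ordered)                  # spacing between consecutive sorted values
    k = int(np.argmax(gaps))
    if gaps[k] < min_gap:                    # no well-separated subgroup on this gene
        return None
    threshold = (ordered[k] + ordered[k + 1]) / 2.0
    return values > threshold


def decompose(X, min_gap=3.0, max_rounds=10):
    """Iteratively decompose clusters until no cluster can be split further."""
    labels = np.zeros(X.shape[0], dtype=int)
    for _ in range(max_rounds):
        changed = False
        for c in np.unique(labels):
            idx = np.flatnonzero(labels == c)
            if idx.size < 2:
                continue
            upper = split_once(X[idx], min_gap)
            if upper is not None:
                labels[idx[upper]] = labels.max() + 1   # new label for the split-off part
                changed = True
        if not changed:
            break
    return labels


def rare_cells(labels, max_fraction=0.05):
    """Indices of cells in clusters holding less than max_fraction of all cells."""
    ids, counts = np.unique(labels, return_counts=True)
    rare_ids = ids[counts < max_fraction * labels.size]
    return np.flatnonzero(np.isin(labels, rare_ids))


# Synthetic toy data (hypothetical): 200 cells x 30 genes with two planted signals.
rng = np.random.default_rng(0)
X = rng.normal(0.0, 0.5, size=(200, 30))
X[120:192, 0:5] += 10.0     # abundant cell type with its own marker genes
X[192:200, 10:15] += 10.0   # rare cell type: 8 of 200 cells (4%)

labels = decompose(X)
found = rare_cells(labels)
print(len(np.unique(labels)), found)
```

On this toy input, a single global split separates only the abundant type, because the rare signal contributes far less variance across the whole dataset. The rare cells become visible once their parent cluster is decomposed again, which is the motivation the abstract gives for moving beyond one-time clustering.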