State Key Laboratory of Common Mechanism Research for Major Diseases, Suzhou Institute of Systems Medicine, Chinese Academy of Medical Sciences & Peking Union Medical College, Suzhou, Jiangsu, China.
Front Immunol. 2023 Jul 20;14:1223471. doi: 10.3389/fimmu.2023.1223471. eCollection 2023.
Accurately identifying immune cell types in single-cell RNA-sequencing (scRNA-Seq) data is critical to uncovering immune responses in health and disease. However, the high heterogeneity and sparsity of scRNA-Seq data, together with the similarity in gene expression among immune cell types, pose a great challenge for accurate identification. Here, we developed a tool named sc-ImmuCC for hierarchical annotation of immune cell types from scRNA-Seq data, based on optimized gene sets and the ssGSEA algorithm. sc-ImmuCC simulates the natural differentiation of immune cells, and its hierarchical annotation comprises three layers that together annotate nine major immune cell types and 29 cell subtypes. Test results showed stable performance and strong consistency across datasets from different tissues, with an average accuracy of 71-90%. In addition, the optimized gene sets and hierarchical annotation strategy can be applied to other methods to improve their annotation accuracy and broaden the spectrum of cell types and subtypes they annotate. We also applied sc-ImmuCC to a dataset composed of COVID-19 patients, influenza patients, and healthy donors, and found that the proportion of monocytes in patients with COVID-19 or influenza was significantly higher than in healthy donors. The easy-to-use sc-ImmuCC tool provides a good way to comprehensively annotate immune cell types from scRNA-Seq data, and will also help in studying the immune mechanisms underlying physiological and pathological conditions.
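The idea of hierarchical annotation with gene-set enrichment scores can be illustrated with a small sketch. This is not the sc-ImmuCC implementation: the abstract does not give the tool's gene sets, parameters, or layer definitions, so the two-layer hierarchy, the marker genes, the function names (`ssgsea_score`, `annotate`), and the rank weight `alpha = 0.25` below are all assumptions chosen for illustration. The score is a simplified ssGSEA-style statistic (the summed difference between a rank-weighted hit distribution and a uniform miss distribution), computed for one cell at a time.

```python
from typing import Dict, List

def ssgsea_score(expr: Dict[str, float], gene_set: List[str], alpha: float = 0.25) -> float:
    """Simplified ssGSEA-style enrichment score for a single cell (illustrative only)."""
    # Rank genes from highest to lowest expression; break ties by gene name.
    ranked = sorted(expr, key=lambda g: (-expr[g], g))
    n = len(ranked)
    members = set(gene_set) & set(ranked)
    if not members or len(members) == n:
        return 0.0
    # Genes in the set are weighted by rank**alpha (top-ranked gene has rank n).
    weights = [(n - i) ** alpha if g in members else 0.0 for i, g in enumerate(ranked)]
    total_hit = sum(weights)
    n_miss = n - len(members)
    hit_cdf = miss_cdf = score = 0.0
    for gene, w in zip(ranked, weights):
        if gene in members:
            hit_cdf += w / total_hit
        else:
            miss_cdf += 1.0 / n_miss
        # Sum the gap between the two cumulative distributions along the ranking.
        score += hit_cdf - miss_cdf
    return score

# Hypothetical two-layer hierarchy with toy marker genes (NOT the tool's gene sets).
HIERARCHY = {
    "root": {
        "T cell": ["CD3D", "CD3E"],
        "B cell": ["CD79A", "MS4A1"],
        "Monocyte": ["CD14", "LYZ"],
    },
    "T cell": {
        "CD4 T cell": ["CD4"],
        "CD8 T cell": ["CD8A"],
    },
}

def annotate(expr: Dict[str, float], hierarchy=HIERARCHY) -> List[str]:
    """Walk down the hierarchy, picking the best-scoring label at each layer."""
    path, node = [], "root"
    while node in hierarchy:
        scores = {label: ssgsea_score(expr, genes) for label, genes in hierarchy[node].items()}
        node = max(scores, key=scores.get)
        path.append(node)
    return path

# Toy expression profile for one cell (made-up values).
cell = {"ACTB": 6, "CD3D": 5, "CD3E": 4, "CD8A": 3, "LYZ": 1,
        "CD4": 0, "CD14": 0, "CD79A": 0, "MS4A1": 0}
print(annotate(cell))  # ['T cell', 'CD8 T cell']
```

Assigning the major type first and only then scoring subtypes within that branch means closely related subtypes are compared against each other rather than against every other cell type, which is the motivation the abstract gives for a layered strategy.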