Center for Precision Health, School of Biomedical Informatics, The University of Texas Health Science Center at Houston, Houston, TX 77030, USA.
Therapeutic Innovation Center, Baylor College of Medicine, Houston, TX 77030, USA.
Genomics Proteomics Bioinformatics. 2023 Apr;21(2):370-384. doi: 10.1016/j.gpb.2022.04.001. Epub 2022 Apr 22.
Single-cell RNA sequencing (scRNA-seq) is revolutionizing the study of complex and dynamic cellular mechanisms. However, cell type annotation remains a main challenge as it largely relies on a priori knowledge and manual curation, which is cumbersome and subjective. The increasing number of scRNA-seq datasets, as well as numerous published genetic studies, has motivated us to build a comprehensive human cell type reference atlas. Here, we present decoding Cell type Specificity (deCS), an automatic cell type annotation method augmented by a comprehensive collection of human cell type expression profiles and marker genes. We used deCS to annotate scRNA-seq data from various tissue types and systematically evaluated the annotation accuracy under different conditions, including reference panels, sequencing depth, and feature selection strategies. Our results demonstrate that expanding the references is critical for improving annotation accuracy. Compared to many existing state-of-the-art annotation tools, deCS significantly reduced computation time and increased accuracy. deCS can be integrated into the standard scRNA-seq analytical pipeline to enhance cell type annotation. Finally, we demonstrated the broad utility of deCS to identify trait-cell type associations in 51 human complex traits, providing deep insights into the cellular mechanisms underlying disease pathogenesis. All documents for deCS, including source code, user manual, demo data, and tutorials, are freely available at https://github.com/bsml320/deCS.