School of Chemistry and Molecular Biosciences, University of Queensland, Brisbane, QLD 4072, Australia.
School of Biomedical Sciences, University of Queensland, Brisbane, QLD 4072, Australia.
Bioinformatics. 2023 Jul 1;39(7). doi: 10.1093/bioinformatics/btad435.
Identification of cell types using single-cell RNA-seq is revolutionizing the study of multicellular organisms. However, typical single-cell RNA-seq analysis often involves post hoc manual curation to ensure clusters are transcriptionally distinct, which is time-consuming, error-prone, and irreproducible.
To overcome these obstacles, we developed Cytocipher, a bioinformatics method and scverse-compatible software package that statistically determines significant clusters. Application of Cytocipher to normal tissue, development, disease, and large-scale atlas data demonstrates its broad applicability and power to generate biological insights in numerous contexts. These insights include the identification of cell types not previously described in the datasets analysed, such as CD8+ T cell subtypes in human peripheral blood mononuclear cells; cell lineage intermediate states during mouse pancreas development; and subpopulations of luminal epithelial cells over-represented in prostate cancer. Cytocipher also scales to large datasets with high test performance, as shown by application to the Tabula Sapiens Atlas representing >480 000 cells. Cytocipher is a novel and generalizable method that statistically determines transcriptionally distinct and programmatically reproducible clusters from single-cell data.
The software version used for this manuscript has been deposited on Zenodo (https://doi.org/10.5281/zenodo.8089546) and is also available via GitHub (https://github.com/BradBalderson/Cytocipher).