Institute for Protein Research, Osaka University, Suita, Osaka 565-0871, Japan.
Division of Health Sciences, Osaka University Graduate School of Medicine, Suita, Osaka 565-0871, Japan.
Bioinformatics. 2022 Sep 15;38(18):4330-4336. doi: 10.1093/bioinformatics/btac541.
Single-cell RNA sequencing (scRNA-seq) analysis reveals cellular heterogeneity and dynamic cell transitions. However, conventional gene-based analyses require intensive manual curation to interpret the biological implications of computational results. A framework for efficiently annotating individual cells is therefore still needed.
We present ASURAT, a computational tool that simultaneously performs unsupervised clustering and functional annotation of single-cell transcriptomic data in terms of disease, cell type, biological process and signaling pathway activity. It does so by applying a correlation graph decomposition to the genes belonging to database-derived functional terms. We validated the usability and clustering performance of ASURAT on scRNA-seq datasets for human peripheral blood mononuclear cells, where it required less manual curation than existing methods. Moreover, we applied ASURAT to scRNA-seq and spatial transcriptome datasets for human small cell lung cancer and pancreatic ductal adenocarcinoma, respectively, identifying previously overlooked subpopulations and differentially expressed genes. ASURAT is a powerful tool for dissecting cell subpopulations and improving the biological interpretability of complex and noisy transcriptomic data.
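As a rough illustration of what decomposing a correlation graph for the genes of one functional term could look like, the Python sketch below thresholds the gene-gene correlation matrix, takes the connected components of the resulting graph as gene groups, and averages centered expression within each group to obtain one score per group and cell. This is a simplified stand-in, not ASURAT's actual algorithm or API: the function names `correlation_components` and `component_scores`, the single `threshold` parameter, and the use of connected components are all assumptions made for illustration.

```python
import numpy as np


def correlation_components(expr, threshold=0.5):
    """Group the genes of one functional term by thresholded correlation.

    expr: array of shape (n_genes, n_cells) for the genes of a single term.
    Returns a sorted list of sorted gene-index lists, one per connected
    component of the graph whose edges are correlations >= threshold.
    Hypothetical helper; not part of the ASURAT package.
    """
    corr = np.corrcoef(expr)        # gene-by-gene Pearson correlation
    adjacency = corr >= threshold   # NaN entries (constant genes) give False
    unvisited = set(range(expr.shape[0]))
    components = []
    while unvisited:
        start = unvisited.pop()
        stack, group = [start], [start]
        while stack:                # depth-first traversal of one component
            g = stack.pop()
            neighbours = [h for h in unvisited if adjacency[g, h]]
            for h in neighbours:
                unvisited.remove(h)
                stack.append(h)
                group.append(h)
        components.append(sorted(group))
    return sorted(components)


def component_scores(expr, components):
    """Average gene-wise centered expression within each gene group.

    Returns an array of shape (n_components, n_cells); each row is a
    per-cell score for one gene group. Hypothetical helper.
    """
    centered = expr - expr.mean(axis=1, keepdims=True)
    return np.vstack([centered[idx].mean(axis=0) for idx in components])


if __name__ == "__main__":
    # Toy term with four genes measured in six cells (made-up numbers).
    a = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])
    b = np.array([1.0, -1.0, -1.0, 1.0, 1.0, -1.0])
    toy = np.vstack([a, 2 * a, b, 3 * b])
    groups = correlation_components(toy, threshold=0.5)
    print(groups)
    print(component_scores(toy, groups).shape)
```

A matrix of such group-by-cell scores, stacked over many functional terms, could then be passed to any standard clustering routine, so that each cluster is described directly by the terms that score highly in it rather than by individual marker genes.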
ASURAT is published on Bioconductor (https://doi.org/10.18129/B9.bioc.ASURAT). The code for analyzing the data in this article is available on GitHub (https://github.com/keita-iida/ASURATBI) and figshare (https://doi.org/10.6084/m9.figshare.19200254.v4).
Supplementary data are available at Bioinformatics online.