Detecting significant expression patterns in single-cell and spatial transcriptomics with a flexible computational approach.
Affiliations
Computer Science Department, Technion - Israel Institute of Technology, Haifa, Israel.
Faculty of Biology, Technion - Israel Institute of Technology, Haifa, Israel.
Publication information
Sci Rep. 2024 Oct 30;14(1):26121. doi: 10.1038/s41598-024-75314-3.
Gene expression data can shed light on multiple biological processes at once. However, data analysis methods for single-cell sequencing mostly focus on finding cell clusters or the principal progression line of the data. Data analysis for spatial transcriptomics mostly addresses clustering and finding spatially variable genes. Existing data analysis methods are effective at finding the main data features, but they might miss less pronounced, albeit significant, processes, possibly involving only a subset of the samples. In this work we present SPIRAL: Significant Process InfeRence ALgorithm. SPIRAL uses Gaussian statistics to detect all statistically significant biological processes in single-cell, bulk and spatial transcriptomics data. The algorithm outputs a list of structures, each defined by a set of genes working simultaneously in a specific population of cells. SPIRAL is unique in its flexibility: the structures are constructed by selecting subsets of genes and cells based on statistically significant and consistent differential expression. Every gene and every cell may belong to one structure, to several, or to none. SPIRAL also provides several visual representations of structures, along with pathway enrichment information. We validated the statistical soundness of SPIRAL on synthetic datasets and applied it to single-cell, spatial and bulk RNA-sequencing datasets. SPIRAL is available at https://spiral.technion.ac.il/.
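To make the notion of a structure more concrete, the following is a minimal toy sketch, not the SPIRAL implementation. It assumes the two cell groups are already given (SPIRAL selects cell subsets itself), uses a simple Gaussian approximation for the difference in group means, and applies a Bonferroni correction; the function names, the input layout and the threshold are all hypothetical choices made for illustration.

```python
import math
from statistics import mean, variance


def normal_sf(z):
    """Upper-tail probability of a standard normal variable."""
    return 0.5 * math.erfc(z / math.sqrt(2))


def toy_structure(expr, cells_high, cells_low, alpha=0.05):
    """Collect genes that are significantly higher in one cell group.

    expr: dict mapping gene name -> list of expression values, one per cell.
    cells_high, cells_low: lists of cell indices (assumed to be given here).
    This is an illustrative stand-in, not the procedure used by SPIRAL.
    """
    threshold = alpha / len(expr)  # Bonferroni correction over genes
    genes = []
    for gene, values in expr.items():
        high = [values[i] for i in cells_high]
        low = [values[i] for i in cells_low]
        # Standard error of the difference in means (unequal variances).
        se = math.sqrt(variance(high) / len(high) + variance(low) / len(low))
        if se == 0:
            continue  # no variation in either group, statistic undefined
        z = (mean(high) - mean(low)) / se
        if normal_sf(z) < threshold:  # one-sided Gaussian p-value
            genes.append(gene)
    return {
        "genes": genes,
        "cells_high": list(cells_high),
        "cells_low": list(cells_low),
    }


# Made-up expression values for three genes across eight cells.
expr = {
    "geneA": [9.0, 10.0, 11.0, 10.0, 1.0, 2.0, 1.0, 2.0],
    "geneB": [5.0, 6.0, 5.0, 6.0, 5.0, 6.0, 5.0, 6.0],
    "geneC": [8.0, 9.0, 8.0, 9.0, 0.0, 1.0, 0.0, 1.0],
}
structure = toy_structure(expr, cells_high=[0, 1, 2, 3], cells_low=[4, 5, 6, 7])
print(structure["genes"])
```

In this toy data, geneA and geneC are clearly higher in the first four cells while geneB is flat, so only the first two end up in the returned gene set. With so few cells the Gaussian approximation is crude; the example is meant only to show the shape of a gene set paired with a cell population.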