Rentzsch Philipp, Kollotzek Aaron, Mohammadi Pejman, Lappalainen Tuuli
Science for Life Laboratory, Department of Gene Technology, KTH Royal Institute of Technology, Solna, Sweden.
Center for Immunity and Immunotherapies, Seattle Children's Research Institute, Seattle, WA, USA; Department of Pediatrics, University of Washington School of Medicine, Seattle, WA, USA; Department of Genome Science, University of Washington, Seattle, WA, USA.
bioRxiv. 2024 Apr 10:2024.04.10.588830. doi: 10.1101/2024.04.10.588830.
Differential expression (DE) analysis is widely used to identify genes that are functionally relevant to an observed phenotype or biological response. However, a typical DE analysis selects genes using a fold-change threshold, under the implicit assumption that all genes are equally sensitive to changes in transcript dosage. This favors highly variable genes over more constrained genes, for which even small changes in expression may be biologically relevant. To address this limitation, we have developed a method that recalibrates each gene's differential expression fold change based on the genetic expression variance observed in the human population. The resulting metric ranks statistically differentially expressed genes not by nominal change in expression, but by change relative to the natural dosage variation of each gene. We apply our method to RNA sequencing datasets from rare disease and in vitro stimulus response experiments. Compared to the standard approach, our method adjusts for the discovery bias towards highly variable genes and enriches for pathways and biological processes related to metabolic and regulatory activity, indicating a prioritization of functionally relevant driver genes. Our method thus provides a novel view of DE and helps bridge the existing gap between statistical and biological significance. We believe that this approach will simplify the identification of disease-causing genes and enhance the discovery of therapeutic targets.
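The ranking idea described in the abstract can be illustrated with a minimal sketch. The abstract does not give the exact recalibration formula, so the code below assumes the simplest reading: the absolute log2 fold change is divided by a per-gene standard deviation of expression attributable to genetic variation in the population. The names `GeneResult`, `pop_sd`, `recalibrated_score` and `rank_genes`, the significance cutoff, and the toy values are all hypothetical and are not taken from the authors' implementation.

```python
from dataclasses import dataclass


@dataclass
class GeneResult:
    gene: str
    log2_fc: float  # log2 fold change from a standard DE analysis
    padj: float     # adjusted p-value from the same analysis
    pop_sd: float   # ASSUMPTION: SD of log2 expression due to genetic
                    # variation in the population (hypothetical input)


def recalibrated_score(log2_fc: float, pop_sd: float) -> float:
    """Fold change expressed in units of natural dosage variation.

    ASSUMPTION: a plain ratio; the published metric may differ.
    """
    if pop_sd <= 0:
        raise ValueError("pop_sd must be positive")
    return abs(log2_fc) / pop_sd


def rank_genes(results, alpha=0.05):
    """Keep statistically significant genes, then rank by relative change."""
    significant = [r for r in results if r.padj < alpha]
    return sorted(
        significant,
        key=lambda r: recalibrated_score(r.log2_fc, r.pop_sd),
        reverse=True,
    )


# Toy, made-up values for illustration only.
toy = [
    GeneResult("GENE_A", log2_fc=3.0, padj=0.001, pop_sd=1.5),  # variable gene
    GeneResult("GENE_B", log2_fc=0.8, padj=0.010, pop_sd=0.1),  # constrained gene
    GeneResult("GENE_C", log2_fc=2.0, padj=0.200, pop_sd=0.2),  # not significant
]

ranked = [r.gene for r in rank_genes(toy)]
print(ranked)  # ['GENE_B', 'GENE_A']
```

In this toy case, a fold-change threshold alone would place GENE_A first, whereas the relative score ranks the constrained GENE_B higher because its smaller change is large compared to its natural variation.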