Department of Applied Mathematics, Computer Science and Statistics, Ghent University, Ghent, Belgium.
Bioinformatics Institute Ghent, Ghent University, Ghent, Belgium.
Nat Commun. 2020 Mar 5;11(1):1201. doi: 10.1038/s41467-020-14766-3.
Trajectory inference has radically enhanced single-cell RNA-seq research by enabling the study of dynamic changes in gene expression. Downstream of trajectory inference, it is vital to discover genes that are (i) associated with the lineages in the trajectory, or (ii) differentially expressed between lineages, to illuminate the underlying biological processes. Current data analysis procedures, however, either fail to exploit the continuous resolution provided by trajectory inference, or fail to pinpoint the exact types of differential expression. We introduce tradeSeq, a powerful generalized additive model framework based on the negative binomial distribution that allows flexible inference of both within-lineage and between-lineage differential expression. By incorporating observation-level weights, the model can additionally account for zero inflation. We evaluate the method on simulated datasets and on real datasets from droplet-based and full-length protocols, and show that it yields biological insights through a clear interpretation of the data.
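As a rough illustration of what a negative binomial generalized additive model with observation-level weights can look like, the sketch below writes one plausible form. The notation is assumed here for illustration and is not taken from the abstract: $y_{gi}$ is the count for gene $g$ in cell $i$, $T_{li}$ the pseudotime of cell $i$ along lineage $l$, $Z_{li}$ an indicator assigning cell $i$ to lineage $l$, $s_{gl}$ a lineage-specific smooth function, $N_i$ a sequencing-depth offset, and $w_{gi}$ a weight that down-weights likely excess zeros.

```latex
\begin{align}
  y_{gi} &\sim \mathrm{NB}(\mu_{gi}, \phi_g), \\
  \log \mu_{gi} &= \sum_{l=1}^{L} s_{gl}(T_{li})\, Z_{li} + \log N_i, \\
  \ell_g &= \sum_{i=1}^{n} w_{gi} \,\log f_{\mathrm{NB}}\!\left(y_{gi};\, \mu_{gi}, \phi_g\right).
\end{align}
```

Under this sketch, within-lineage differential expression corresponds to testing whether a single smoother $s_{gl}$ is constant along pseudotime, and between-lineage differential expression to testing whether $s_{gl}$ and $s_{gl'}$ differ for two lineages $l \neq l'$.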