Interpretable trajectory inference with single-cell Linear Adaptive Negative-binomial Expression (scLANE) testing.

Author information

Leary Jack R, Bacher Rhonda

Affiliations

Department of Biostatistics, College of Public Health and Health Professions, University of Florida, Gainesville, FL 32610, USA.

Publication information

bioRxiv. 2023 Dec 20:2023.12.19.572477. doi: 10.1101/2023.12.19.572477.

Abstract

The rapid proliferation of trajectory inference methods for single-cell RNA-seq data has allowed researchers to investigate complex biological processes by examining underlying gene expression dynamics. After estimating a latent cell ordering, statistical models are used to determine which genes exhibit changes in expression that are significantly associated with progression through the biological trajectory. While a few techniques for performing trajectory differential expression exist, most rely on the flexibility of generalized additive models in order to account for the inherent nonlinearity of changes in gene expression. As such, the results can be difficult to interpret, and biological conclusions often rest on subjective visual inspections of the most dynamic genes. To address this challenge, we propose scLANE testing, which is built around an interpretable generalized linear model and handles nonlinearity with basis splines chosen empirically for each gene. In addition, extensions to estimating equations and mixed models allow for reliable trajectory testing under complex experimental designs. After validating the accuracy of scLANE under several different simulation scenarios, we apply it to a set of diverse biological datasets and display its ability to provide novel biological information when used downstream of both pseudotime and RNA velocity estimation methods.
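To make the modeling idea concrete, here is a minimal sketch of the general approach the abstract describes: a negative-binomial generalized linear model in which nonlinearity over the cell ordering is captured by a basis function whose location is chosen from the data. This is not the scLANE implementation. It assumes a single hinge term, a fixed dispersion `alpha`, a small grid of candidate knots, and simulated counts; all function and variable names are hypothetical.

```python
import numpy as np

def fit_nb_glm(X, y, alpha=0.5, n_iter=100, tol=1e-8):
    """Fit an NB2 GLM (log link, fixed dispersion alpha) by IRLS.

    Assumes the first column of X is the intercept.
    Returns the coefficients and the log-likelihood up to a constant
    that does not depend on the coefficients.
    """
    beta = np.zeros(X.shape[1])
    beta[0] = np.log(y.mean() + 1e-8)  # start at the intercept-only fit
    for _ in range(n_iter):
        eta = np.clip(X @ beta, -30, 30)
        mu = np.exp(eta)
        w = mu / (1.0 + alpha * mu)      # IRLS weights: mu^2 / Var(y)
        z = eta + (y - mu) / mu          # working response for the log link
        WX = X * w[:, None]
        beta_new = np.linalg.solve(X.T @ WX, WX.T @ z)
        converged = np.max(np.abs(beta_new - beta)) < tol
        beta = beta_new
        if converged:
            break
    mu = np.exp(np.clip(X @ beta, -30, 30))
    loglik = np.sum(y * np.log(alpha * mu / (1.0 + alpha * mu))
                    - np.log1p(alpha * mu) / alpha)
    return beta, loglik

def hinge_design(t, knot):
    """Design matrix: intercept, pseudotime, and one hinge max(t - knot, 0)."""
    return np.column_stack([np.ones_like(t), t, np.maximum(t - knot, 0.0)])

# Simulated gene (an assumption for illustration): expression rises until
# pseudotime 0.5 and is flat afterwards on the log scale.
rng = np.random.default_rng(0)
n_cells, alpha = 2000, 0.5
t = rng.uniform(0.0, 1.0, n_cells)
mu_true = np.exp(1.0 + 3.0 * t - 3.0 * np.maximum(t - 0.5, 0.0))
y = rng.negative_binomial(1.0 / alpha, 1.0 / (1.0 + alpha * mu_true)).astype(float)

# Choose the knot empirically: try quantiles of pseudotime, keep the best fit.
candidates = np.quantile(t, np.arange(0.10, 0.91, 0.05))
fits = [(knot,) + fit_nb_glm(hinge_design(t, knot), y, alpha) for knot in candidates]
best_knot, best_beta, best_loglik = max(fits, key=lambda f: f[2])

slope_before = best_beta[1]                 # log-scale slope before the knot
slope_after = best_beta[1] + best_beta[2]   # log-scale slope after the knot
print(f"knot={best_knot:.2f}, slope before={slope_before:.2f}, after={slope_after:.2f}")
```

The appeal of this kind of model is that each coefficient has a direct reading: the slope on the log scale before the knot, and the change in slope after it. That is the sort of interpretability the abstract contrasts with generalized additive models.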

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/cd6f/12233584/b32b21e1c23d/nihpp-2023.12.19.572477v2-f0001.jpg
