Pseudobulk with proper offsets has the same statistical properties as generalized linear mixed models in single-cell case-control studies.

Affiliations

Department of Medicine, Seoul National University College of Medicine, Seoul, 03080, Republic of Korea.

Department of Statistics, University of Michigan, Ann Arbor, 48109, United States.

Publication information

Bioinformatics. 2024 Aug 2;40(8). doi: 10.1093/bioinformatics/btae498.

DOI:10.1093/bioinformatics/btae498

PMID:39115884

Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC11343365/

Abstract

MOTIVATION

Generalized linear mixed models (GLMMs), such as the negative-binomial or Poisson linear mixed model, are widely applied to single-cell RNA sequencing data to compare transcript expression between conditions determined at the subject level. However, these models are computationally intensive, and their statistical performance relative to pseudobulk approaches is poorly understood.

RESULTS

We propose offset-pseudobulk as a lightweight alternative to GLMMs. We prove that a count-based pseudobulk equipped with a proper offset variable has the same statistical properties as GLMMs in terms of both point estimates and standard errors. We confirm our findings using simulations based on real data. Offset-pseudobulk is substantially faster (more than 10×) and numerically more stable than GLMMs.

AVAILABILITY AND IMPLEMENTATION

Offset-pseudobulk can be implemented in any generalized linear model software by adjusting a few options. The code is available at https://github.com/hanbin973/pseudobulk_is_mm.
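To make the idea of a count-based pseudobulk with an offset concrete, here is a minimal sketch in plain Python. It is not the authors' implementation (see the repository above for that). It rests on several assumptions that the abstract does not state: the offset is taken to be the log of each subject's summed per-cell library size, the model is a Poisson GLM with a single binary case-control covariate, and the toy data, `aggregate_pseudobulk`, and `fit_poisson_offset` are hypothetical names invented for illustration.

```python
import math


def aggregate_pseudobulk(cells):
    """Sum one gene's counts and the library sizes over each subject's cells.

    cells: iterable of (subject_id, gene_count, library_size) tuples.
    Returns {subject_id: (summed_count, summed_library_size)}.
    """
    totals = {}
    for subject, count, lib_size in cells:
        y, n = totals.get(subject, (0, 0))
        totals[subject] = (y + count, n + lib_size)
    return totals


def _information(mu, x):
    """Fisher information entries for the intercept and the slope."""
    i00 = sum(mu)
    i01 = sum(xi * mi for xi, mi in zip(x, mu))
    i11 = sum(xi * xi * mi for xi, mi in zip(x, mu))
    return i00, i01, i11


def fit_poisson_offset(y, x, offset, n_iter=50, tol=1e-12):
    """Fit log E[y] = offset + b0 + b1 * x by Newton-Raphson.

    Returns (b0, b1, standard error of b1).
    """
    # Start from the intercept-only maximum likelihood estimate.
    b0 = math.log(sum(y) / sum(math.exp(o) for o in offset))
    b1 = 0.0
    for _ in range(n_iter):
        mu = [math.exp(o + b0 + b1 * xi) for o, xi in zip(offset, x)]
        # Score vector of the Poisson log-likelihood.
        g0 = sum(yi - mi for yi, mi in zip(y, mu))
        g1 = sum(xi * (yi - mi) for xi, yi, mi in zip(x, y, mu))
        # Newton step: solve the 2x2 system (information) * delta = score.
        i00, i01, i11 = _information(mu, x)
        det = i00 * i11 - i01 * i01
        d0 = (i11 * g0 - i01 * g1) / det
        d1 = (i00 * g1 - i01 * g0) / det
        b0, b1 = b0 + d0, b1 + d1
        if abs(d0) + abs(d1) < tol:
            break
    # Model-based standard error from the information at the final estimate.
    mu = [math.exp(o + b0 + b1 * xi) for o, xi in zip(offset, x)]
    i00, i01, i11 = _information(mu, x)
    se_b1 = math.sqrt(i00 / (i00 * i11 - i01 * i01))
    return b0, b1, se_b1


# Hypothetical toy data: (subject, count of one gene in a cell, cell library size).
cells = [
    ("s1", 3, 1000), ("s1", 5, 2000),
    ("s2", 2, 1500), ("s2", 1, 500),
    ("s3", 10, 1000), ("s3", 8, 1000),
    ("s4", 12, 2000), ("s4", 6, 1000),
]
condition = {"s1": 0, "s2": 0, "s3": 1, "s4": 1}  # 0 = control, 1 = case

pb = aggregate_pseudobulk(cells)
subjects = sorted(pb)
y = [pb[s][0] for s in subjects]
offset = [math.log(pb[s][1]) for s in subjects]  # assumed form of the offset
x = [condition[s] for s in subjects]

b0, b1, se_b1 = fit_poisson_offset(y, x, offset)
print(f"log fold change = {b1:.4f}, standard error = {se_b1:.4f}")
```

With a single binary covariate, the fitted log fold change coincides with the log ratio of the pooled case and control rates, which is a convenient sanity check. In practice the same model would normally be fitted with an existing GLM routine by passing the offset as an option. The Poisson family is used here only to keep the sketch short and does not account for overdispersion.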
