Accounting for technical noise in Bayesian graphical models of single-cell RNA-sequencing data.

Affiliation

Department of Biostatistics, Epidemiology and Informatics, Perelman School of Medicine, University of Pennsylvania, 423 Guardian Drive, Philadelphia, PA 19104, USA.

Publication information

Biostatistics. 2022 Dec 12;24(1):161-176. doi: 10.1093/biostatistics/kxab011.

Abstract

Single-cell RNA-sequencing (scRNAseq) data contain a high level of noise, especially in the form of zero-inflation, that is, the presence of an excessively large number of zeros. This is largely due to dropout events and amplification biases that occur in the preparation stage of single-cell experiments. Recent scRNAseq experiments have been augmented with unique molecular identifiers (UMIs) and External RNA Control Consortium (ERCC) molecules, which can be used to account for zero-inflation. However, most current methods for graphical models are developed under the assumption of the multivariate Gaussian distribution or its variants, and thus they cannot adequately account for the excessively large number of zeros in scRNAseq data. In this article, we propose a single-cell latent graphical model (scLGM), a Bayesian hierarchical model for estimating the conditional dependency network among genes using scRNAseq data. Taking advantage of UMI and ERCC data, scLGM explicitly models the two sources of zero-inflation. Our simulation study and real data analysis demonstrate that the proposed approach outperforms several existing methods.
