
scBoolSeq: Linking scRNA-seq statistics and Boolean dynamics.

Affiliations

Univ. Bordeaux, CNRS, Bordeaux INP, LaBRI, UMR 5800, Talence, France.

Institut Curie, Université PSL, Paris, France.

Publication information

PLoS Comput Biol. 2024 Jul 8;20(7):e1011620. doi: 10.1371/journal.pcbi.1011620. eCollection 2024 Jul.

Abstract

Boolean networks are widely used to model the qualitative dynamics of cell fate processes by describing how the binary activation states of genes and transcription factors change over time. Being able to bridge such qualitative states with quantitative measurements of gene expression in cells, such as scRNA-seq, is a cornerstone of data-driven model construction and validation. On the one hand, scRNA-seq binarisation is a key step for inferring and validating Boolean models. On the other hand, generating synthetic scRNA-seq data from baseline Boolean models provides an important asset for benchmarking inference methods. However, linking characteristics of scRNA-seq datasets, including dropout events, with Boolean states is a challenging task. We present scBoolSeq, a method for the bidirectional linking of scRNA-seq data and the Boolean activation states of genes. Given a reference scRNA-seq dataset, scBoolSeq computes statistical criteria to classify the empirical gene pseudocount distributions as unimodal, bimodal, or zero-inflated, and fits a probabilistic model of dropouts with gene-dependent parameters. From these learnt distributions, scBoolSeq can both binarise scRNA-seq datasets and generate synthetic scRNA-seq datasets from Boolean traces, such as those produced by Boolean networks, using biased sampling and dropout simulation. We present a case study demonstrating the application of scBoolSeq's binarisation scheme in data-driven model inference. Furthermore, we compare synthetic scRNA-seq data generated by scBoolSeq with data generated by BoolODE for the same Boolean network model. The comparison shows that our method better reproduces the statistics of real scRNA-seq datasets, such as the mean-variance and mean-dropout relationships, while exhibiting clearly defined trajectories in two-dimensional projections of the data.
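To make the two directions described in the abstract concrete, the following is a minimal illustrative sketch in plain Python. It is not scBoolSeq's API or its fitted model. The function names, the Gaussian sampling, the fixed dropout rate, and the threshold-with-margin rule are all assumptions chosen only to show the general idea: sampling an expression value biased by a Boolean state, zeroing it out to simulate dropout, and mapping a value back to active, inactive, or undetermined.

```python
import random


def sample_expression(state, low_mean=1.0, high_mean=6.0, sd=0.5,
                      dropout_rate=0.2, rng=None):
    """Draw one synthetic pseudocount for a gene given its Boolean state.

    Hypothetical model: a Gaussian centred on a low or high mean
    (biased sampling), followed by a dropout event that replaces the
    value with zero with probability `dropout_rate`.
    """
    rng = rng or random.Random()
    mean = high_mean if state else low_mean
    value = max(0.0, rng.gauss(mean, sd))  # expression cannot be negative
    if rng.random() < dropout_rate:
        return 0.0  # simulated dropout
    return value


def binarise(value, threshold=3.5, margin=1.0):
    """Map a pseudocount to True, False, or None (undetermined).

    Hypothetical rule: an exact zero is left undetermined because it may
    be a dropout; values within `margin` of the threshold are also left
    undetermined.
    """
    if value == 0.0:
        return None
    if value >= threshold + margin:
        return True
    if value <= threshold - margin:
        return False
    return None


if __name__ == "__main__":
    rng = random.Random(0)
    trace = [False, False, True, True, False]  # Boolean trace of one gene
    values = [sample_expression(s, rng=rng) for s in trace]
    print([binarise(v) for v in values])
```

In this sketch a binarised value either matches the original Boolean state or is left undetermined, which is the behaviour one would want from a round trip between the two representations. A real method would estimate the per-gene parameters from a reference dataset rather than fixing them by hand.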


https://cdn.ncbi.nlm.nih.gov/pmc/blobs/1561/11257695/3b85a5f7bf9f/pcbi.1011620.g001.jpg
