Department of Structural Biology, St. Jude Children's Research Hospital, Memphis, TN, USA.
These authors contributed equally: Zeliha Kilic, Max Schweiger.
Nat Comput Sci. 2023 Feb;3(2):174-183. doi: 10.1038/s43588-022-00392-0. Epub 2023 Jan 19.
Gene expression models, which are key to understanding cellular regulatory responses, underlie observations of single-cell transcriptional dynamics. Although RNA expression data encode information on gene expression models, existing computational frameworks do not perform simultaneous Bayesian inference of gene expression models and parameters from such data. Rather, gene expression models, composed of gene states, their connectivities and associated parameters, are currently deduced by pre-specifying the number of gene states and their connectivity before learning the associated rate parameters. Here we propose a method to learn full distributions over gene states, state connectivities and associated rate parameters, simultaneously and self-consistently from single-molecule RNA counts. We propagate noise from fluctuating RNA counts over models by treating the models themselves as random variables. We achieve this within a Bayesian non-parametric paradigm. We demonstrate our method on the pathway and the pathway, and verify its robustness on synthetic data.
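To make "gene states, state connectivities and associated rate parameters" concrete, the sketch below simulates synthetic RNA counts from the simplest pre-specified model: a two-state (OFF/ON) gene. It is an illustration only and does not reproduce the inference method of the paper. It uses the standard Gillespie algorithm, and the function names and the rate names `k_on`, `k_off`, `beta` and `gamma` are hypothetical choices made for this example.

```python
import random


def simulate_rna_count(k_on, k_off, beta, gamma, t_end, rng):
    """Gillespie simulation of a two-state gene; returns the RNA count at t_end.

    Assumed (illustrative) model: the gene switches OFF->ON at rate k_on and
    ON->OFF at rate k_off, produces RNA at rate beta while ON, and each RNA
    molecule degrades at rate gamma. The gene starts OFF with zero RNA.
    """
    t, gene_on, rna = 0.0, False, 0
    while True:
        # Propensities of the three possible reactions in the current state.
        a_switch = k_off if gene_on else k_on
        a_prod = beta if gene_on else 0.0
        a_deg = gamma * rna
        total = a_switch + a_prod + a_deg
        if total == 0.0:
            return rna  # no reaction can ever fire
        # Waiting time to the next reaction is exponential with rate `total`.
        t += rng.expovariate(total)
        if t > t_end:
            return rna
        # Choose which reaction fires, in proportion to its propensity.
        u = rng.random() * total
        if u < a_switch:
            gene_on = not gene_on
        elif u < a_switch + a_prod:
            rna += 1
        else:
            rna -= 1


def simulate_population(n_cells, k_on, k_off, beta, gamma, t_end, seed=0):
    """Simulate independent cells and return one RNA count per cell."""
    rng = random.Random(seed)
    return [
        simulate_rna_count(k_on, k_off, beta, gamma, t_end, rng)
        for _ in range(n_cells)
    ]


if __name__ == "__main__":
    counts = simulate_population(
        500, k_on=1.0, k_off=1.0, beta=10.0, gamma=1.0, t_end=20.0
    )
    print("mean RNA count:", sum(counts) / len(counts))
```

Here the number of states (two) and their connectivity (OFF and ON interconvert) are fixed in advance, which is exactly the pre-specification the abstract describes as current practice. The proposed method instead treats these as unknowns to be inferred together with the rates.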