Department of Physics and Astronomy, University of Denver, Denver, Colorado.
Laufer Center for Physical and Quantitative Biology, Stony Brook University, Stony Brook, New York.
Biophys J. 2023 Jul 11;122(13):2623-2635. doi: 10.1016/j.bpj.2023.05.017. Epub 2023 May 22.
Gene expression is inherently noisy because proteins and nucleic acids are present in small numbers inside a cell. Cell division is likewise stochastic, particularly when tracked at the single-cell level. The two processes become coupled when gene expression affects the rate of cell division. Single-cell time-lapse experiments can measure both fluctuations by simultaneously recording protein levels inside a cell and its stochastic division. These information-rich, noisy trajectory data sets can be harnessed to learn underlying molecular and cellular details that are often not known a priori. A critical question is: how can we infer a model from data in which fluctuations at two levels (gene expression and cell division) are intricately convoluted? We show that the principle of maximum caliber (MaxCal), integrated within a Bayesian framework, can be used to infer several cellular and molecular details (division rates, protein production rates, and degradation rates) from these coupled stochastic trajectories (CSTs). We demonstrate this proof of concept using synthetic data generated from a known model. An additional challenge in data analysis is that trajectories are often recorded not in protein numbers but in noisy fluorescence that depends on protein number in a probabilistic manner. We show that MaxCal can again infer important molecular and cellular rates even when the data are in fluorescence, a further example of CSTs in which three confounding factors (gene expression noise, cell division noise, and fluorescence distortion) are all coupled. Our approach will provide guidance for building models in synthetic biology experiments as well as in general biological systems, where examples of CSTs are abundant.
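To make the idea of a coupled stochastic trajectory concrete, the following is a minimal illustrative sketch of how synthetic data of this kind might be generated. It is not the authors' model or code. It assumes a Gillespie-type simulation with constant protein production, first-order degradation, a division rate that increases linearly with protein number, binomial partitioning of proteins at division, and Gaussian per-protein fluorescence. All function names, parameter names, and numerical values (`simulate_cst`, `to_fluorescence`, `k_prod`, `k_deg`, `div_base`, `div_slope`, and so on) are hypothetical.

```python
import math
import random


def simulate_cst(k_prod=5.0, k_deg=0.1, div_base=0.01, div_slope=0.001,
                 t_end=200.0, n0=0, seed=0):
    """Gillespie simulation of protein number in one tracked cell lineage.

    Assumed (hypothetical) reactions:
      production:  n -> n + 1   at rate k_prod
      degradation: n -> n - 1   at rate k_deg * n
      division:    n -> Binomial(n, 1/2)  at rate div_base + div_slope * n
    Returns a list of (time, protein_count, event) tuples.
    """
    rng = random.Random(seed)
    t, n = 0.0, n0
    traj = [(t, n, "start")]
    while True:
        rates = [k_prod, k_deg * n, div_base + div_slope * n]
        total = sum(rates)
        t += rng.expovariate(total)  # waiting time to the next event
        if t >= t_end:
            break
        u = rng.random() * total  # choose which event fires
        if u < rates[0]:
            n += 1
            event = "prod"
        elif u < rates[0] + rates[1]:
            n -= 1
            event = "deg"
        else:
            # Each protein goes to the tracked daughter with probability 1/2.
            n = sum(1 for _ in range(n) if rng.random() < 0.5)
            event = "div"
        traj.append((t, n, event))
    return traj


def to_fluorescence(traj, mean_per_protein=100.0, sd_per_protein=30.0, seed=1):
    """Map protein counts to noisy fluorescence (assumed Gaussian per protein).

    The total signal from n proteins is drawn from a normal distribution with
    mean n * mean_per_protein and standard deviation sqrt(n) * sd_per_protein.
    """
    rng = random.Random(seed)
    out = []
    for t, n, _ in traj:
        if n > 0:
            f = rng.gauss(n * mean_per_protein, math.sqrt(n) * sd_per_protein)
        else:
            f = 0.0
        out.append((t, f))
    return out


if __name__ == "__main__":
    trajectory = simulate_cst()
    fluorescence = to_fluorescence(trajectory)
    n_div = sum(1 for _, _, e in trajectory if e == "div")
    print(f"{len(trajectory)} events recorded, {n_div} divisions")
```

In a sketch like this, the inference task described above would take the fluorescence trace and division times as input and attempt to recover the production, degradation, and division parameters, which are known here only because the data are synthetic.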