A data-driven method to learn a jump diffusion process from aggregate biological gene expression data.

Gao Jia-Xing, Wang Zhen-Yi, Zhang Michael Q, Qian Min-Ping, Jiang Da-Quan

LMAM, School of Mathematical Sciences, Peking University, Beijing 100871, China.

MOE Key Laboratory of Bioinformatics; Bioinformatics Division and Center for Synthetic and Systems Biology, BNRist; Department of Automation, Tsinghua University, Beijing 100084, China.

J Theor Biol. 2022 Jan 7;532:110923. doi: 10.1016/j.jtbi.2021.110923. Epub 2021 Oct 1.

Dynamic models of gene expression are urgently required. In this paper, we describe the time evolution of gene expression by learning a jump diffusion process that models the biological process directly. Our algorithm takes aggregate gene expression data as input and outputs the parameters of the jump diffusion process. The learned process can predict population distributions of gene expression at any developmental stage, generate long-time trajectories for individual cells, and offer a novel approach to computing RNA velocity. Moreover, it allows biological systems to be studied from a stochastic dynamic perspective. Gene expression data at a time point, which is a snapshot of a cellular process, is treated as an empirical marginal distribution of a stochastic process. The dynamics are learned by minimizing the Wasserstein distance between the empirical distribution and the distribution predicted by the jump diffusion process. For the learned jump diffusion process, the trajectories correspond to the development of cells, the stochasticity determines the heterogeneity of cells, the instantaneous rate of state change can be taken as "RNA velocity", and changes in the scales and orientations of clusters can also be observed. We demonstrate that, on both synthetic and real-world datasets, our method recovers the underlying nonlinear dynamics better than previous parametric models and diffusion processes driven by Brownian motion. Our method is also robust to perturbations of the data because the computation involves only population expectations.
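The fitting idea in the abstract (simulate a jump diffusion, compare its predicted marginal distribution with a snapshot, and keep the parameters with the smallest Wasserstein distance) can be illustrated with a toy one-dimensional sketch. Everything below is an assumption made for illustration and is not taken from the paper: the linear drift `-theta * x`, the Gaussian jump sizes, the Euler-Maruyama discretization, the grid search over `theta`, and all function names are hypothetical. The authors' actual parameterization and optimizer may differ substantially.

```python
import numpy as np


def simulate_jump_diffusion(x0, drift, sigma, jump_rate, jump_scale,
                            t_end, n_steps, rng):
    """Euler-Maruyama sketch of dX = drift(X) dt + sigma dW + dJ.

    J is taken to be a compound Poisson process with zero-mean Gaussian
    jump sizes (an illustrative assumption, not the paper's model).
    """
    x = np.array(x0, dtype=float)
    dt = t_end / n_steps
    for _ in range(n_steps):
        # Brownian increment for every cell.
        dw = rng.normal(0.0, np.sqrt(dt), size=x.shape)
        # Number of jumps per cell in this step.
        n_jumps = rng.poisson(jump_rate * dt, size=x.shape)
        # A sum of n Gaussian jumps has standard deviation jump_scale * sqrt(n).
        jumps = rng.normal(0.0, 1.0, size=x.shape) * jump_scale * np.sqrt(n_jumps)
        x = x + drift(x) * dt + sigma * dw + jumps
    return x


def wasserstein_1d(a, b):
    """Wasserstein-1 distance between two equal-sized 1D samples."""
    a, b = np.sort(np.asarray(a, float)), np.sort(np.asarray(b, float))
    if a.shape != b.shape:
        raise ValueError("samples must have the same size")
    return float(np.mean(np.abs(a - b)))


def wasserstein_loss(theta, x0, observed, seed):
    """Distance between the snapshot and the distribution predicted for theta."""
    rng = np.random.default_rng(seed)
    predicted = simulate_jump_diffusion(
        x0, lambda x: -theta * x, sigma=0.3, jump_rate=1.0,
        jump_scale=0.5, t_end=1.0, n_steps=50, rng=rng)
    return wasserstein_1d(predicted, observed)


# Synthetic "snapshot" generated with a hypothetical true theta of 1.5.
rng = np.random.default_rng(0)
x0 = rng.normal(2.0, 0.2, size=500)
observed = simulate_jump_diffusion(
    x0, lambda x: -1.5 * x, sigma=0.3, jump_rate=1.0,
    jump_scale=0.5, t_end=1.0, n_steps=50, rng=rng)

# Crude grid search standing in for a real optimizer.
candidates = [0.5, 1.0, 1.5, 2.0, 2.5]
losses = {theta: wasserstein_loss(theta, x0, observed, seed=1)
          for theta in candidates}
best_theta = min(losses, key=losses.get)
print(best_theta, losses[best_theta])
```

Only population-level samples enter the loss here; no individual cell is tracked between the initial and the observed snapshot, which mirrors the aggregate-data setting described above. In more than one dimension the sorted-sample formula no longer applies, and an optimal transport solver or an approximation would be needed.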


Similar articles

Macromolecular crowding: chemistry and physics meet biology (Ascona, Switzerland, 10-14 June 2012).

Phys Biol. 2013 Aug;10(4):040301. doi: 10.1088/1478-3975/10/4/040301. Epub 2013 Aug 2.

The Distance Between: An Algorithmic Approach to Comparing Stochastic Models to Time-Series Data.

Bull Math Biol. 2024 Jul 26;86(9):111. doi: 10.1007/s11538-024-01331-y.

Robust H Adaptive Fuzzy Tracking Control for MIMO Nonlinear Stochastic Poisson Jump Diffusion Systems.

IEEE Trans Cybern. 2019 Aug;49(8):3116-3130. doi: 10.1109/TCYB.2018.2839364. Epub 2018 Jun 8.

Stochastic modeling and simulation of reaction-diffusion system with Hill function dynamics.

BMC Syst Biol. 2017 Mar 14;11(Suppl 3):21. doi: 10.1186/s12918-017-0401-9.

Stochastic models for prodrug targeting. 1. Diffusion of the efflux drug.

Mol Pharm. 2006 Mar-Apr;3(2):187-95. doi: 10.1021/mp050089l.

A jump persistent turning walker to model zebrafish locomotion.

J R Soc Interface. 2015 Jan 6;12(102):20140884. doi: 10.1098/rsif.2014.0884.

Velocity-jump models with crowding effects.

Phys Rev E Stat Nonlin Soft Matter Phys. 2011 Dec;84(6 Pt 1):061920. doi: 10.1103/PhysRevE.84.061920. Epub 2011 Dec 28.

Persistent random walk of cells involving anomalous effects and random death.

Phys Rev E Stat Nonlin Soft Matter Phys. 2015 Apr;91(4):042124. doi: 10.1103/PhysRevE.91.042124. Epub 2015 Apr 20.

Heat shock response in CHO mammalian cells is controlled by a nonlinear stochastic process.

PLoS Comput Biol. 2007 Oct;3(10):1859-70. doi: 10.1371/journal.pcbi.0030187. Epub 2007 Aug 13.
