IEEE/ACM Trans Comput Biol Bioinform. 2024 Sep-Oct;21(5):1504-1517. doi: 10.1109/TCBB.2024.3402220. Epub 2024 Oct 9.
The complexity and scale of regulatory networks (e.g., gene regulatory networks and microbial networks) introduce substantial uncertainty into their models. This uncertainty often cannot be fully reduced using the limited and costly data acquired under the systems' normal conditions. Regulatory networks also often suffer from non-identifiability, in which the true underlying network model cannot be clearly distinguished from other possible models. Perturbation, or excitation, is a well-known process in systems biology for acquiring targeted data that reveal the complex underlying mechanisms of regulatory networks and overcome non-identifiability. We consider a general class of Boolean network models that capture the activation and inactivation of components and their complex interactions. Assuming partially available knowledge about the interactions between network components, this paper formulates the inference process through the maximum a posteriori (MAP) criterion. We develop a Bayesian lookahead policy that systematically perturbs regulatory networks to maximize the performance of MAP inference under the perturbed data. This is achieved by optimally formulating the perturbation process in a reinforcement learning context and deriving a scalable deep reinforcement learning perturbation policy that computes a near-optimal Bayesian policy. The proposed method learns the perturbation policy through planning, without the need for any real data. The high performance of the proposed approach is demonstrated by comprehensive numerical experiments using the well-known mammalian cell cycle and gut microbial community networks.
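To make the non-identifiability argument concrete, the following is a minimal Python sketch of MAP inference over candidate Boolean network models, with and without perturbation. It is an illustration only, not the method of the paper: the threshold update rule in `step`, the XOR flip perturbation, the independent Bernoulli state noise with probability `p`, the two-gene network, and all function names are assumptions introduced here. The reinforcement learning policy that chooses perturbations is not shown; the perturbation sequence is fixed by hand.

```python
import math

def step(R, x):
    """Assumed threshold rule: gene i turns on for positive net input,
    off for negative net input, and keeps its state for zero input."""
    out = []
    for i, row in enumerate(R):
        s = sum(r * xj for r, xj in zip(row, x))
        out.append(1 if s > 0 else 0 if s < 0 else x[i])
    return tuple(out)

def simulate(R, x0, perturbs):
    """Noise-free trajectory; each perturbation a_k flips the selected genes (XOR)."""
    traj = [x0]
    for a in perturbs:
        traj.append(tuple(f ^ ai for f, ai in zip(step(R, traj[-1]), a)))
    return traj

def log_likelihood(R, traj, perturbs, p):
    """Log-likelihood of a trajectory, assuming each gene is flipped
    independently by noise with probability p at every step."""
    ll = 0.0
    for x_prev, a, x_next in zip(traj, perturbs, traj[1:]):
        pred = tuple(f ^ ai for f, ai in zip(step(R, x_prev), a))
        flips = sum(pi != xi for pi, xi in zip(pred, x_next))
        ll += flips * math.log(p) + (len(x_next) - flips) * math.log(1 - p)
    return ll

def map_estimate(candidates, prior, traj, perturbs, p=0.05):
    """Return (index of the MAP model, posterior over candidate models)."""
    log_post = [math.log(pr) + log_likelihood(R, traj, perturbs, p)
                for R, pr in zip(candidates, prior)]
    m = max(log_post)                      # subtract max for numerical stability
    w = [math.exp(l - m) for l in log_post]
    z = sum(w)
    post = [wi / z for wi in w]
    best = max(range(len(post)), key=post.__getitem__)
    return best, post

# Hypothetical two-gene example: the unknown interaction is whether
# gene 0 activates gene 1 (model A) or has no effect on it (model B).
R_A = [[0, 0], [1, 0]]
R_B = [[0, 0], [0, 0]]
candidates, prior = [R_A, R_B], [0.5, 0.5]
x0 = (0, 0)

# No perturbation: both models keep the resting state, so data cannot separate them.
idle = [(0, 0), (0, 0)]
_, post_idle = map_estimate(candidates, prior, simulate(R_A, x0, idle), idle)

# Perturbation: switch gene 0 on at the first step, then observe the response.
kick = [(1, 0), (0, 0)]
best, post_perturbed = map_estimate(candidates, prior, simulate(R_A, x0, kick), kick)

print(post_idle, post_perturbed, best)
```

In this toy setting the unperturbed data leave the posterior at the prior, because both candidate models predict the same resting trajectory. Forcing gene 0 on exposes the disputed interaction, and the posterior mass moves to model A (about 0.95 with `p=0.05`).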