Alali Mohammad, Imani Mahdi
Department of Electrical and Computer Engineering at Northeastern University.
Proc Am Control Conf. 2023 May-Jun;2023:3957-3964. doi: 10.23919/acc55779.2023.10155867. Epub 2023 Jul 3.
Gene regulatory networks (GRNs) consist of multiple interacting genes whose activities govern various cellular processes. Limited genomics data and the complexity of the interactions between components often introduce large uncertainty into models of these biological systems. Moreover, inferring the interactions between GRN components from data acquired under the normal condition of these systems is challenging and, in some cases, impossible. Perturbation is a well-known genomics approach that excites targeted components to gather informative data from these systems. This paper models GRNs using the Boolean network with perturbation, where the network uncertainty takes the form of unknown interactions between genes. Unlike existing heuristic and greedy data-acquisition methods, this paper provides an optimal Bayesian formulation of the data-acquisition process in the reinforcement learning context, where the actions are perturbations and the reward measures the step-wise improvement in inference accuracy. We develop a semi-gradient reinforcement learning method with function approximation for learning a near-optimal data-acquisition policy. The obtained policy yields near-exact Bayesian optimality with respect to the entire uncertainty in the regulatory network model and can be learned offline through planning. We demonstrate the performance of the proposed framework using the well-known p53-Mdm2 negative feedback loop gene regulatory network.
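As an illustration of the setting described in the abstract, the following is a minimal sketch, not the authors' implementation. It assumes a common form of the Boolean network with perturbation, in which the next state is the Boolean update XOR a perturbation mask XOR independent bit-flip noise with probability p. The two 3-gene candidate models, the function names, and the reward definition (gain in the largest posterior probability) are all hypothetical choices made for the example; the paper's semi-gradient learning of the perturbation policy is not shown, and perturbations are chosen at random here.

```python
import random

# Hypothetical candidate models of a toy 3-gene network (not from the paper).
# They differ only in how gene 1 regulates gene 2 (the "unknown interaction").
def model_a(x):
    # gene 2 is repressed by gene 1
    return (x[2], x[0], 1 - x[1])

def model_b(x):
    # gene 2 is activated by gene 1
    return (x[2], x[0], x[1])

def predict(model, x, action):
    """Noise-free next state: Boolean update XOR perturbation mask (assumed form)."""
    return tuple(f ^ a for f, a in zip(model(x), action))

def simulate(model, x, action, p, rng):
    """One noisy transition: each gene flips independently with probability p."""
    return tuple(v ^ int(rng.random() < p) for v in predict(model, x, action))

def likelihood(model, x, action, x_next, p):
    """P(x_next | x, action, model) under independent bit-flip noise."""
    flips = sum(u != v for u, v in zip(predict(model, x, action), x_next))
    return p ** flips * (1 - p) ** (len(x) - flips)

def update_posterior(posterior, models, x, action, x_next, p):
    """Bayes rule over the finite set of candidate models."""
    weights = [w * likelihood(m, x, action, x_next, p)
               for w, m in zip(posterior, models)]
    total = sum(weights)
    return [w / total for w in weights]

def reward(old_posterior, new_posterior):
    """Illustrative reward (an assumption): gain in the largest posterior mass."""
    return max(new_posterior) - max(old_posterior)

if __name__ == "__main__":
    rng = random.Random(0)
    models = [model_a, model_b]
    posterior = [0.5, 0.5]          # uniform prior over the two candidates
    x, p = (0, 0, 0), 0.1
    for _ in range(10):
        # Random single-gene perturbation; a learned policy would choose this.
        action = tuple(int(i == rng.randrange(3)) for i in range(3))
        x_next = simulate(model_a, x, action, p, rng)   # model_a plays the truth
        new_posterior = update_posterior(posterior, models, x, action, x_next, p)
        print(action, x_next, round(reward(posterior, new_posterior), 3))
        posterior, x = new_posterior, x_next
```

In this sketch the posterior over candidate models plays the role of the belief on which a data-acquisition policy would act; replacing the random choice of `action` with a learned policy is where a reinforcement learning method would enter.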