IEEE/ACM Trans Comput Biol Bioinform. 2017 Nov-Dec;14(6):1214-1227. doi: 10.1109/TCBB.2016.2595577. Epub 2016 Jul 28.
External control of gene regulatory networks (GRNs) has received much attention in recent years. The aim is to find a series of actions to apply to a gene regulation system so that it avoids its diseased states. In this work, we propose a novel method for controlling partially observable GRNs that combines batch mode reinforcement learning (Batch RL) with the TD(λ) algorithm. Unlike existing studies, which infer a computational model from gene expression data and then derive a control policy over the constructed model, our idea is to interpret the time-series gene expression data as a sequence of observations produced by the system and to obtain an approximate stochastic policy directly from the gene expression data, without estimating the internal states of the partially observable environment. We thereby avoid the most time-consuming phases of the existing studies: inferring a model and running that model to compute the control. Results show that our method provides control solutions for regulatory systems of several thousand genes in only seconds, whereas existing studies cannot solve control problems involving even a few dozen genes. Results also show that our approximate stochastic policies are almost as good as the policies generated by the existing studies.
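To make the described idea concrete, the following is a minimal sketch of a TD(λ)-style batch update applied directly to observation sequences taken from time-series expression data, as opposed to first inferring a network model. This is not the authors' implementation: the binarization step, the reward function, the assumption that intervention actions are recorded alongside each expression profile, and all parameter values are illustrative assumptions only.

```
# Illustrative sketch: batch TD(lambda) over observation/action trajectories
# derived from time-series gene expression data, without estimating the
# hidden states of the partially observable regulatory system.
# All names, rewards, and parameters below are assumptions for illustration.

import numpy as np
from collections import defaultdict

GAMMA = 0.95    # discount factor (assumed)
LAMBDA_ = 0.8   # eligibility-trace decay (assumed)
ALPHA = 0.1     # learning rate (assumed)

def binarize(expression_row, threshold=0.5):
    """Map a real-valued gene expression vector to a binary observation."""
    return tuple(int(x > threshold) for x in expression_row)

def reward(observation, target_gene_index):
    """Toy reward: +1 if the (assumed) disease-associated gene is OFF."""
    return 1.0 if observation[target_gene_index] == 0 else -1.0

def batch_td_lambda(trajectories, target_gene_index):
    """One batch pass of TD(lambda) over observation/action trajectories.

    Each trajectory is a list of (expression_vector, action) pairs taken
    from the time-series data; the actions are the external interventions
    assumed to be recorded with the expressions.  Returns an approximate
    action-value table keyed by (observation, action).
    """
    q = defaultdict(float)
    for traj in trajectories:
        traces = defaultdict(float)             # eligibility traces
        obs_actions = [(binarize(x), a) for x, a in traj]
        for t in range(len(obs_actions) - 1):
            (o, a), (o_next, a_next) = obs_actions[t], obs_actions[t + 1]
            r = reward(o_next, target_gene_index)
            td_error = r + GAMMA * q[(o_next, a_next)] - q[(o, a)]
            traces[(o, a)] += 1.0               # accumulating traces
            for key in list(traces):
                q[key] += ALPHA * td_error * traces[key]
                traces[key] *= GAMMA * LAMBDA_
    return q

def stochastic_policy(q, observation, actions, temperature=1.0):
    """Softmax over learned action values: an approximate stochastic policy."""
    values = np.array([q[(observation, a)] for a in actions])
    exp_v = np.exp((values - values.max()) / temperature)
    return dict(zip(actions, exp_v / exp_v.sum()))
```

Because the update works only on the recorded observation sequences, its cost grows with the length and number of trajectories rather than with the size of any inferred model, which is consistent with the scalability claim in the abstract.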