Microsoft Research, Cambridge, UK; Department of Biochemistry, University of Cambridge, Cambridge, UK.
Department of Computer Science, University of Leicester, Leicester, UK.
Front Bioeng Biotechnol. 2014 Dec 19;2:75. doi: 10.3389/fbioe.2014.00075. eCollection 2014.
Over the last decade, executable models of biological behaviors have repeatedly provided new scientific discoveries, uncovered novel insights, and directed new experimental avenues. These models are computer programs whose execution mechanistically simulates aspects of the cell's behaviors. If the observed behavior of the program agrees with the observed biological behavior, then the program explains the phenomena. One advantage of this approach is that techniques for the analysis of computer programs can be applied to the analysis of executable models. For example, one can confirm that a model agrees with experiments for all possible executions of the model (corresponding to all environmental conditions), even if the number of executions is huge. Various formal methods have been adapted for this context, for example, model checking or symbolic analysis of state spaces. To avoid manual construction of executable models, one can apply synthesis, a method to produce programs automatically from high-level specifications. In the context of biological modeling, synthesis corresponds to extracting executable models from experimental data. We survey recent results on applying the techniques that underlie synthesis of computer programs to the inference of biological models from experimental data. We describe synthesis of biological models from curated mutation experiment data, inference of network connectivity models from phosphoproteomic data, and synthesis of Boolean networks from gene expression data. While much work has been done on automated analysis of similar datasets using machine learning and artificial intelligence, synthesis techniques provide new opportunities, such as efficient computation of disambiguating experiments and the ability to produce different kinds of models automatically from biological data.
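As a rough illustration of what synthesizing a Boolean network from data can mean, the sketch below enumerates every Boolean update function for a two-gene network and keeps those consistent with a set of observed state transitions. The gene names, the observations, and all function names are hypothetical and chosen only for illustration. Exhaustive enumeration stands in here for the more scalable solver-based techniques a practical tool would presumably use; this is a minimal sketch, not the method of any surveyed work.

```python
from itertools import product

# Hypothetical toy data: observed (state, next_state) pairs over genes (a, b).
# The transition out of state (1, 0) has deliberately not been observed.
GENES = ("a", "b")
OBSERVED = [
    ((0, 0), (0, 1)),
    ((0, 1), (1, 1)),
    ((1, 1), (1, 0)),
]


def all_boolean_functions(n_inputs):
    """Yield every Boolean function of n_inputs variables as a truth-table dict."""
    inputs = list(product((0, 1), repeat=n_inputs))
    for outputs in product((0, 1), repeat=len(inputs)):
        yield dict(zip(inputs, outputs))


def synthesize(observed, n_genes):
    """For each gene, list all update functions that match every observation."""
    candidates = []
    for gene in range(n_genes):
        consistent = [
            f
            for f in all_boolean_functions(n_genes)
            if all(f[state] == nxt[gene] for state, nxt in observed)
        ]
        candidates.append(consistent)
    return candidates


def disambiguating_states(functions):
    """Return the input states on which the candidate functions disagree."""
    if not functions:
        return []
    return [
        state
        for state in functions[0]
        if len({f[state] for f in functions}) > 1
    ]


def simulate(functions, state, steps):
    """Execute the model synchronously: one update function per gene."""
    trace = [state]
    for _ in range(steps):
        state = tuple(f[state] for f in functions)
        trace.append(state)
    return trace


candidates = synthesize(OBSERVED, len(GENES))
for name, funcs in zip(GENES, candidates):
    print(name, "candidates:", len(funcs),
          "disagree on:", disambiguating_states(funcs))

# Any consistent model reproduces the observed trajectory when executed.
model = [funcs[0] for funcs in candidates]
print(simulate(model, (0, 0), 3))
```

In this toy setting, several candidate functions remain for each gene, and they differ only on the unobserved state. Measuring the network's behavior from that state is therefore the experiment that would tell the remaining models apart, which is the idea behind computing disambiguating experiments.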