Department of Cell Biology, Harvard Medical School, Boston, MA, USA; cBio Center, Department of Data Sciences, Dana-Farber Cancer Institute, Boston, MA, USA; Broad Institute, Cambridge, MA, USA.
Cell Syst. 2021 Feb 17;12(2):128-140.e4. doi: 10.1016/j.cels.2020.11.013. Epub 2020 Dec 28.
Systematic perturbation of cells followed by comprehensive measurements of molecular and phenotypic responses provides informative data resources for constructing computational models of cell biology. Models that generalize well beyond training data can be used to identify combinatorial perturbations of potential therapeutic interest. Major challenges for machine learning on large biological datasets are to find global optima in a complex multidimensional space and mechanistically interpret the solutions. To address these challenges, we introduce a hybrid approach that combines explicit mathematical models of cell dynamics with a machine-learning framework, implemented in TensorFlow. We tested the modeling framework on a perturbation-response dataset of a melanoma cell line after drug treatments. The models can be efficiently trained to describe cellular behavior accurately. Even though completely data driven and independent of prior knowledge, the resulting de novo network models recapitulate some known interactions. The approach is readily applicable to various kinetic models of cell biology. A record of this paper's Transparent Peer Review process is included in the Supplemental Information.
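The abstract does not give the model equations, so the following is only a minimal sketch of what an "explicit mathematical model of cell dynamics" with learnable interaction parameters might look like. It assumes, purely for illustration, an ODE of the form dx/dt = eps * tanh(W x + u) - alpha * x, where `W` is a hypothetical interaction matrix, `u` is the perturbation (for example, drug) input, and `eps` and `alpha` are assumed response and decay constants. The names `simulate` and `mse` are likewise illustrative and are not taken from the paper. In a hybrid setup such as the one described, a loss like `mse` would be minimized over `W` with a gradient-based optimizer (TensorFlow, per the abstract); here only the forward simulation and the loss are shown, in plain Python.

```python
import math


def simulate(W, u, eps=1.0, alpha=1.0, dt=0.1, n_steps=1000):
    """Euler-integrate dx/dt = eps * tanh(W x + u) - alpha * x, starting at x = 0.

    W: n x n interaction matrix (list of lists); W[i][j] is the assumed
       effect of node j on node i.
    u: length-n perturbation vector.
    Returns the state after n_steps, used as an approximate steady state.
    """
    n = len(u)
    x = [0.0] * n
    for _ in range(n_steps):
        # Net input to each node: weighted sum of the other nodes plus perturbation.
        drive = [sum(W[i][j] * x[j] for j in range(n)) + u[i] for i in range(n)]
        # One explicit Euler step: saturating response minus linear decay.
        x = [x[i] + dt * (eps * math.tanh(drive[i]) - alpha * x[i]) for i in range(n)]
    return x


def mse(pred, obs):
    """Mean squared error between predicted and observed responses."""
    return sum((p - o) ** 2 for p, o in zip(pred, obs)) / len(obs)


# Toy two-node network (hypothetical): node 0 activates node 1.
W = [[0.0, 0.0],
     [1.0, 0.0]]
u = [0.5, 0.0]          # perturb node 0 only
x_ss = simulate(W, u)   # approximate steady-state response
loss = mse(x_ss, [math.tanh(0.5), math.tanh(math.tanh(0.5))])
```

With `eps = alpha = 1`, the steady state of this toy network satisfies x0 = tanh(0.5) and x1 = tanh(x0), so the loss against those values is close to zero. Fitting `W` to measured responses would replace the fixed matrix with trainable parameters.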