Wytock Thomas P, Motter Adilson E
Department of Physics and Astronomy, Northwestern University, Evanston, Illinois 60208, USA.
Center for Network Dynamics, Northwestern University, Evanston, Illinois 60208, USA.
ArXiv. 2024 Mar 7:arXiv:2403.04837v1.
Recent developments in synthetic biology, next-generation sequencing, and machine learning provide an unprecedented opportunity to rationally design new disease treatments that reprogram cell behavior based on measured responses to gene perturbations and drugs. The main challenges to seizing this opportunity are the incomplete knowledge of the cellular network and the combinatorial explosion of possible interventions, neither of which can be overcome experimentally. To address these challenges, we develop a transfer learning approach to control cell behavior. The approach is pre-trained on transcriptomic data associated with human cell fates, yielding a model of the functional network dynamics that can be transferred to specific reprogramming goals. It additively combines transcriptional responses to gene perturbations (single-gene knockdowns and overexpressions) to minimize the transcriptional difference between a given pair of initial and target states. We demonstrate the flexibility of our approach by applying it to a microarray dataset comprising over 9,000 microarrays across 54 cell types and 227 unique perturbations, and to an RNA-Seq dataset consisting of over 10,000 sequencing runs across 36 cell types and 138 perturbations. Our approach reproduces known reprogramming protocols with an average AUROC of 0.91 while innovating over existing methods by pre-training an adaptable model that can be tailored to specific reprogramming transitions. We show that the number of gene perturbations required to steer from one fate to another increases as the developmental relatedness decreases. We also show that fewer genes are needed to progress along developmental paths than to regress. Together, these findings establish a proof-of-concept for our approach to computationally designing control strategies and demonstrate the ability of these strategies to provide insights into the dynamics of gene regulatory networks.
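As a rough illustration of the additive combination described in the abstract, the Python sketch below selects perturbations whose summed transcriptional responses bring an initial expression state closer to a target state. The abstract does not specify the optimization procedure, so everything here is an assumption made for illustration: the greedy selection rule, the Euclidean distance, the apply-once (unit dose) treatment of each perturbation, the hypothetical function name `greedy_additive_control`, and the toy data.

```python
import numpy as np


def greedy_additive_control(x_init, x_target, responses, max_perturbations=5):
    """Greedily choose perturbations whose summed responses approach x_target.

    responses has shape (n_perturbations, n_genes); row k is the ASSUMED
    change in expression caused by perturbation k (hypothetical input).
    Returns the chosen perturbation indices and the final distance to target.
    """
    state = np.asarray(x_init, dtype=float).copy()
    target = np.asarray(x_target, dtype=float)
    responses = np.asarray(responses, dtype=float)

    used = np.zeros(responses.shape[0], dtype=bool)
    chosen = []
    best_dist = float(np.linalg.norm(target - state))

    for _ in range(max_perturbations):
        # Distance to the target after adding each candidate response
        # to the current (already perturbed) state.
        dists = np.linalg.norm(target - (state + responses), axis=1)
        dists[used] = np.inf  # each perturbation is applied at most once
        k = int(np.argmin(dists))
        if dists[k] >= best_dist:
            break  # no remaining perturbation reduces the distance
        used[k] = True
        chosen.append(k)
        state = state + responses[k]
        best_dist = float(dists[k])

    return chosen, best_dist


# Toy example (fabricated numbers, three genes and three perturbations).
x_init = [0.0, 0.0, 0.0]
x_target = [1.0, 1.0, 0.0]
responses = [
    [1.0, 0.0, 0.0],  # perturbation 0 raises gene 0
    [0.0, 1.0, 0.0],  # perturbation 1 raises gene 1
    [0.0, 0.0, 1.0],  # perturbation 2 raises gene 2 (moves away from target)
]
chosen, dist = greedy_additive_control(x_init, x_target, responses)
print(chosen, dist)
```

In this toy case the sketch picks the two perturbations that move the state toward the target and stops before the third, which would increase the distance. A greedy search is only one simple way to explore the combinatorial space of interventions; it is not guaranteed to find the optimal combination.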