National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, Maryland, United States of America.
PLoS One. 2019 Aug 2;14(8):e0220742. doi: 10.1371/journal.pone.0220742. eCollection 2019.
Reprogramming of somatic cells to induced pluripotent stem cells, by overexpressing certain factors referred to as the reprogramming factors, can revolutionize regenerative medicine. To provide a coherent description of induced pluripotency from the gene regulation perspective, we use 35 microarray datasets to construct a reprogramming gene regulatory network. Comprising 276 nodes and 4471 links, the resulting network is, to the best of our knowledge, the largest gene regulatory network constructed for human fibroblast reprogramming, and it is the only one built using a large number of experimental datasets. To build the network, we propose a model that relates the expression profiles of the initial (fibroblast) and final (induced pluripotent stem cell) states, and we fit the model parameters (link strengths) using the experimental data. Twenty-nine additional experimental datasets are collectively used to test the model/network, and good agreement between experimental and predicted gene expression profiles is found. We show that the model in conjunction with the constructed network can make useful predictions. For example, we demonstrate that our approach can incorporate the effect of reprogramming factor stoichiometry and that its predictions are consistent with the experimentally observed trends in reprogramming efficiency when the stoichiometric ratios vary. Using our model/network, we also suggest new candidate sets of reprogramming factors (not used in training the model), many of which have already been experimentally verified. These results suggest our model/network can potentially be used in devising new recipes for induced pluripotency with higher efficiencies. Additionally, we classify the links of the network into three classes of different importance, prioritizing them for experimental verification. We show that many of the links in the top-ranked class are experimentally known to be important in reprogramming. Finally, by comparing with other methods, we show that using our model is advantageous.
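The abstract does not state the functional form of the model, so the following is only an illustrative sketch of the general idea of fitting link strengths that map initial to final expression profiles and then checking predictions on held-out data. It assumes a linear map restricted to a fixed set of allowed links and fits it by ridge regression on synthetic data; the linear form, the ridge penalty, the toy network size, and all names (`fit_link_strengths`, `predict`, `mask`) are assumptions for illustration, not the authors' method. Only the dataset counts (35 for fitting, 29 for testing) are taken from the abstract.

```python
import numpy as np


def fit_link_strengths(x_init, x_final, mask, ridge=1e-3):
    """Fit W so that x_final ~= x_init @ W.T, using only allowed links.

    x_init, x_final: arrays of shape (n_samples, n_genes).
    mask: boolean (n_genes, n_genes); mask[i, j] is True if a link
          from gene j to gene i is allowed (hypothetical network).
    Returns W where W[i, j] is the fitted strength of link j -> i.
    """
    n_genes = x_init.shape[1]
    w = np.zeros((n_genes, n_genes))
    for i in range(n_genes):
        regulators = np.flatnonzero(mask[i])
        if regulators.size == 0:
            continue  # gene i has no incoming links in this toy network
        a = x_init[:, regulators]
        # Ridge-regularised normal equations for target gene i.
        lhs = a.T @ a + ridge * np.eye(regulators.size)
        rhs = a.T @ x_final[:, i]
        w[i, regulators] = np.linalg.solve(lhs, rhs)
    return w


def predict(w, x_init):
    """Predict final-state expression profiles from initial-state ones."""
    return x_init @ w.T


# Synthetic toy data: 6 genes, 35 training pairs, 29 test pairs.
rng = np.random.default_rng(0)
n_genes = 6
mask = rng.random((n_genes, n_genes)) < 0.5
w_true = rng.normal(size=(n_genes, n_genes)) * mask

x_train = rng.normal(size=(35, n_genes))
y_train = x_train @ w_true.T + 0.01 * rng.normal(size=(35, n_genes))

x_test = rng.normal(size=(29, n_genes))
y_test = x_test @ w_true.T

w_fit = fit_link_strengths(x_train, y_train, mask)
y_pred = predict(w_fit, x_test)

# Agreement between predicted and "observed" held-out profiles.
r = np.corrcoef(y_pred.ravel(), y_test.ravel())[0, 1]
print(f"Pearson correlation on held-out profiles: {r:.4f}")
```

In this sketch the network structure is fixed in advance by `mask` and only the strengths are learned, which keeps each per-gene fit small even when the number of datasets is modest relative to the number of genes.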