Dowell Karen G, Simons Allen K, Bai Hao, Kell Braden, Wang Zack Z, Yun Kyuson, Hibbs Matthew A
The Jackson Laboratory, Bar Harbor, Maine, USA; Graduate School of Biomedical Sciences and Engineering, University of Maine, Orono, Maine, USA.
Stem Cells. 2014 May;32(5):1161-72. doi: 10.1002/stem.1612.
Embryonic stem cells (ESCs), characterized by their ability to both self-renew and differentiate into multiple cell lineages, are a powerful model for biomedical research and developmental biology. Human and mouse ESCs share many features, yet have distinctive aspects, including fundamental differences in the signaling pathways and cell cycle controls that support self-renewal. Here, we explore the molecular basis of human ESC self-renewal using Bayesian network machine learning to integrate cell-type-specific, high-throughput data for gene function discovery. We integrated high-throughput ESC data from 83 human studies (1.8 million data points collected under 1,100 conditions) and 62 mouse studies (2.4 million data points collected under 1,085 conditions) into separate human and mouse predictive networks focused on ESC self-renewal to analyze shared and distinct functional relationships among protein-coding gene orthologs. Computational evaluations show that these networks are highly accurate, literature validation confirms their biological relevance, and reverse transcriptase polymerase chain reaction (RT-PCR) validation supports our predictions. Our results reflect the importance of key regulatory genes known to be strongly associated with self-renewal and pluripotency in both species (e.g., POU5F1, SOX2, and NANOG), identify metabolic differences between species (e.g., threonine metabolism), clarify differences between human and mouse ESC developmental signaling pathways (e.g., leukemia inhibitory factor (LIF)-activated JAK/STAT in mouse; NODAL/ACTIVIN-A-activated fibroblast growth factor in human), and reveal many novel genes and pathways predicted to be functionally associated with self-renewal in each species. These interactive networks are available online at www.StemSight.org for stem cell researchers to develop new hypotheses, discover potential mechanisms involving sparsely annotated genes, and prioritize genes of interest for experimental validation.
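As a rough illustration of how Bayesian integration of many datasets can yield a single confidence score for a gene pair, the sketch below uses a naive Bayes model, which is an assumption made here for simplicity; the abstract does not specify the network structure, the evidence binning, or any parameter values. The function name, the bin labels, and all numbers are hypothetical.

```python
import math

def posterior_functional(prior, likelihoods, evidence):
    """Naive Bayes posterior that a gene pair is functionally related.

    prior       -- assumed P(related) before seeing any data
    likelihoods -- one dict per dataset: bin -> (P(bin | related), P(bin | unrelated))
    evidence    -- one observed bin per dataset, or None if the pair is not measured
    """
    # Work in log-odds so that many datasets can be combined without underflow.
    log_odds = math.log(prior / (1.0 - prior))
    for table, observed in zip(likelihoods, evidence):
        if observed is None:
            continue  # a dataset with no measurement contributes no evidence
        p_rel, p_unrel = table[observed]
        log_odds += math.log(p_rel / p_unrel)
    # Convert log-odds back to a probability.
    return 1.0 / (1.0 + math.exp(-log_odds))

# Toy, made-up conditional probability tables for two hypothetical datasets.
likelihoods = [
    {"low": (0.2, 0.6), "high": (0.8, 0.4)},
    {"low": (0.3, 0.7), "high": (0.7, 0.3)},
]

p = posterior_functional(0.1, likelihoods, ["high", "high"])
print(round(p, 4))
```

In this toy setting each dataset multiplies the prior odds by its likelihood ratio, so two datasets that both show "high" co-expression raise the score well above the prior, while a missing measurement leaves it unchanged.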