Costa Ivan G, Roepcke Stefan, Hafemeister Christoph, Schliep Alexander
Department of Computational Molecular Biology, Max Planck Institute for Molecular Genetics, Berlin, Germany.
Bioinformatics. 2008 Jul 1;24(13):i156-64. doi: 10.1093/bioinformatics/btn153.
The regulation of proliferation and differentiation of embryonic and adult stem cells into mature cells is central to developmental biology. Gene expression measured in distinguishable developmental stages helps to elucidate the underlying molecular processes. In previous work, we showed that functional gene modules, which act distinctly in the course of development, can be represented by a mixture of trees. In general, similarities in the gene expression programs of cell populations reflect similarities in their differentiation paths.
We propose a novel model for gene expression profiles and an unsupervised learning method to estimate developmental similarity and infer differentiation pathways. We assess the performance of our model on simulated data, where it compares favorably with related methods. We also infer differentiation pathways and predict functional modules in gene expression data of lymphoid development.
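As a rough illustration of the premise that expression similarity between cell populations tracks similarity in their differentiation paths, the sketch below links developmental stages by a maximum spanning tree over pairwise Pearson correlations. This is a simplified stand-in and not the mixture-of-trees model or the learning method proposed here; the function names `pearson` and `similarity_tree`, the stage labels, and the toy expression values are all hypothetical.

```python
import math


def pearson(x, y):
    """Pearson correlation of two equal-length expression vectors."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    if sx == 0 or sy == 0:
        return 0.0  # constant profile: treat as uncorrelated
    return cov / (sx * sy)


def similarity_tree(profiles):
    """Maximum spanning tree over stages (Prim's algorithm).

    profiles: dict mapping a stage name to a list of expression
    values, with genes in the same order for every stage.
    Returns a list of (stage_in_tree, newly_added_stage) edges.
    """
    stages = list(profiles)
    in_tree = {stages[0]}  # assumption: grow the tree from the first stage
    edges = []
    while len(in_tree) < len(stages):
        # Pick the most strongly correlated pair that crosses the cut.
        _, u, v = max(
            (
                (pearson(profiles[u], profiles[v]), u, v)
                for u in stages
                if u in in_tree
                for v in stages
                if v not in in_tree
            ),
            key=lambda t: t[0],
        )
        edges.append((u, v))
        in_tree.add(v)
    return edges


if __name__ == "__main__":
    # Hypothetical toy data: four genes measured in three stages.
    toy = {
        "stage_a": [1.0, 2.0, 3.0, 4.0],
        "stage_b": [1.0, 2.0, 3.0, 5.0],
        "stage_c": [4.0, 3.0, 2.0, 1.0],
    }
    for parent, child in similarity_tree(toy):
        print(parent, "->", child)
```

The resulting tree is undirected in spirit: the edge orientation only reflects the order in which Prim's algorithm added stages, so assigning a developmental direction would require extra knowledge, such as which population is the stem cell stage.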
We demonstrate for the first time how, in principle, incorporating knowledge about the dependence structure helps to reveal differentiation pathways and potentially relevant functional gene modules from microarray datasets. Our method applies in any area of developmental biology where it is possible to obtain cells of distinguishable differentiation stages.
The implementation of our method (GPL license), data and additional results are available at http://algorithmics.molgen.mpg.de/Supplements/InfDif/.
Supplementary data are available at Bioinformatics online.