Villoutreix Paul, Andén Joakim, Lim Bomyi, Lu Hang, Kevrekidis Ioannis G, Singer Amit, Shvartsman Stanislav Y
Lewis-Sigler Institute for Integrative Genomics, Princeton University, Princeton, New Jersey, United States of America.
Program in Applied and Computational Mathematics, Princeton University, Princeton, New Jersey, United States of America.
PLoS Comput Biol. 2017 Sep 18;13(9):e1005742. doi: 10.1371/journal.pcbi.1005742. eCollection 2017 Sep.
Dynamical processes in biology are studied using an ever-increasing number of techniques, each of which brings out unique features of the system. One of the current challenges is to develop systematic approaches for fusing heterogeneous datasets into an integrated view of multivariable dynamics. We demonstrate that heterogeneous data fusion can be successfully implemented within a semi-supervised learning framework that exploits the intrinsic geometry of high-dimensional datasets. We illustrate our approach using a dataset from studies of pattern formation in Drosophila. The result is a continuous trajectory that reveals the joint dynamics of gene expression, subcellular protein localization, protein phosphorylation, and tissue morphogenesis. Our approach can be readily adapted to other imaging modalities and forms a starting point for further steps of data analytics and modeling of biological dynamics.
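One way to picture the semi-supervised, geometry-based idea mentioned in the abstract is graph-based label propagation. The sketch below is illustrative and is not taken from the paper: it assumes that each snapshot is a feature vector, that only a few snapshots carry a known value (here a developmental time), and that the missing values can be filled in by a harmonic extension over a Gaussian-kernel neighborhood graph. The function name `harmonic_extension`, the kernel width `sigma`, and the synthetic helix data are all hypothetical choices made for the example.

```python
import numpy as np


def harmonic_extension(X, labeled_idx, labeled_values, sigma=0.1):
    """Fill in values at unlabeled points by harmonic extension on a graph.

    X              : (n, d) array of feature vectors, one row per snapshot
    labeled_idx    : indices of the snapshots whose value is known
    labeled_values : known values at those indices
    sigma          : Gaussian kernel width (an assumed tuning parameter)
    """
    n = X.shape[0]
    labeled_idx = np.asarray(labeled_idx)
    labeled_values = np.asarray(labeled_values, dtype=float)

    # Pairwise squared distances and Gaussian affinities between snapshots.
    d2 = np.sum((X[:, None, :] - X[None, :, :]) ** 2, axis=-1)
    W = np.exp(-d2 / (2.0 * sigma ** 2))
    np.fill_diagonal(W, 0.0)

    # Unnormalized graph Laplacian L = D - W.
    L = np.diag(W.sum(axis=1)) - W

    unlabeled_idx = np.setdiff1d(np.arange(n), labeled_idx)

    # Harmonic condition on the unlabeled block: L_uu f_u = -L_ul f_l.
    L_uu = L[np.ix_(unlabeled_idx, unlabeled_idx)]
    L_ul = L[np.ix_(unlabeled_idx, labeled_idx)]
    f_u = np.linalg.solve(L_uu, -L_ul @ labeled_values)

    f = np.empty(n)
    f[labeled_idx] = labeled_values
    f[unlabeled_idx] = f_u
    return f


# Synthetic stand-in for a developmental trajectory: points along a helix,
# parameterized by a "time" t in [0, 1]. Real data would be image features.
t_true = np.linspace(0.0, 1.0, 50)
X = np.column_stack([np.cos(np.pi * t_true), np.sin(np.pi * t_true), t_true])

# Pretend only a handful of snapshots come with a known time stamp.
labeled_idx = np.array([0, 10, 20, 30, 40, 49])
t_est = harmonic_extension(X, labeled_idx, t_true[labeled_idx])
max_abs_error = float(np.max(np.abs(t_est - t_true)))
```

In this toy setting the labeled snapshots act as anchors, and every unlabeled snapshot receives a weighted average of its neighbors' values, so the estimate follows the shape of the data rather than a fixed parametric model. The kernel width controls how far information spreads and would need to be chosen for the data at hand.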