Brugnone Nathan, Gonopolskiy Alex, Moyle Mark W, Kuchroo Manik, van Dijk David, Moon Kevin R, Colon-Ramos Daniel, Wolf Guy, Hirn Matthew J, Krishnaswamy Smita
Dept. of Comp. Math., Sci. & Eng., Michigan State University, East Lansing, MI, USA.
PicnicHealth, Berlin, Germany.
Proc IEEE Int Conf Big Data. 2019 Dec;2019:2624-2633. doi: 10.1109/BigData47090.2019.9006013. Epub 2020 Feb 24.
Big data often has emergent structure at multiple levels of abstraction, and these levels are useful for characterizing complex interactions and dynamics of the observations. Here, we consider multiple levels of abstraction via a multiresolution geometry of data points at different granularities. To construct this geometry, we define a time-inhomogeneous diffusion process that effectively condenses data points together to uncover nested groupings at larger and larger granularities. This inhomogeneous process creates a deep cascade of intrinsic low-pass filters on the data affinity graph. The filters are applied in sequence to gradually eliminate local variability while adjusting the learned data geometry to increasingly coarse resolutions. We provide visualizations that exhibit our method as a "continuously-hierarchical" clustering, with the directions of eliminated variation highlighted at each step. We demonstrate the utility of our algorithm via neuronal data condensation, where the constructed multiresolution data geometry uncovers the organization, grouping, and connectivity among neurons.
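To make the condensation idea concrete, the following is a minimal sketch of one way such a process could be written. It is an illustration and not the authors' implementation. It assumes a Gaussian kernel with a fixed bandwidth and a fixed number of steps, none of which the abstract specifies. The names `diffusion_operator`, `condense`, `bandwidth`, and `n_steps` are hypothetical. At each step a row-stochastic diffusion operator is rebuilt from the current point positions and applied to them, so every point moves to a weighted average of its neighbors.

```python
import numpy as np


def diffusion_operator(points, bandwidth):
    """Row-stochastic Markov matrix built from a Gaussian kernel.

    The Gaussian kernel is an assumption made for this sketch.
    """
    diffs = points[:, None, :] - points[None, :, :]
    sq_dists = np.sum(diffs ** 2, axis=-1)
    affinity = np.exp(-sq_dists / bandwidth ** 2)
    # Normalize each row so it sums to 1 (a local averaging filter).
    return affinity / affinity.sum(axis=1, keepdims=True)


def condense(points, bandwidth=1.0, n_steps=20):
    """Return the point configurations X_0, X_1, ..., X_{n_steps}."""
    history = [np.asarray(points, dtype=float)]
    for _ in range(n_steps):
        current = history[-1]
        # The operator is recomputed from the current positions at every
        # step, which is what makes the process time-inhomogeneous.
        operator = diffusion_operator(current, bandwidth)
        history.append(operator @ current)
    return history


# Two small, well-separated groups of points (made-up toy data).
points = np.array([
    [0.0, 0.0], [0.2, 0.1], [0.1, 0.3],
    [10.0, 10.0], [10.2, 9.9], [9.9, 10.1],
])
history = condense(points, bandwidth=1.0, n_steps=20)

initial_spread = np.ptp(history[0][:3], axis=0).max()
final_spread = np.ptp(history[-1][:3], axis=0).max()
print(initial_spread, final_spread)
```

In this toy run the points inside each group collapse toward a common location while the two groups stay far apart, since the kernel weight between them is negligible at this bandwidth. Recording the configuration at every step, as `history` does, is what gives a hierarchy of groupings rather than a single clustering.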