Kuchroo Manik, Godavarthi Abhinav, Tong Alexander, Wolf Guy, Krishnaswamy Smita
Yale University, Dept. of Neuro., Mila - Quebec AI Institute.
Dept. of Genetics, Mila - Quebec AI Institute.
IEEE Int Workshop Mach Learn Signal Process. 2021 Oct;2021. doi: 10.1109/mlsp52302.2021.9596214. Epub 2021 Nov 15.
We propose a method called integrated diffusion for combining multimodal data, gathered via different sensors on the same system, to create an integrated data diffusion operator. Because real-world data suffer from both local and global noise, we introduce mechanisms to optimally calculate a diffusion operator that reflects the combined information in the data by maintaining the low-frequency eigenvectors of each modality both globally and locally. We show the utility of this integrated operator in denoising and visualizing multimodal toy data as well as multi-omic data generated from blood cells, in which both gene expression and chromatin accessibility are measured. Our approach better visualizes the geometry of the integrated data and captures known cross-modality associations. More generally, integrated diffusion is broadly applicable to multimodal datasets collected with noisy sensors in a variety of fields.
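To make the idea of a combined diffusion operator concrete, the following is a minimal sketch and not the authors' implementation. It assumes that each modality gets its own row-stochastic diffusion operator built from a Gaussian kernel, and that the two operators are combined by multiplying their powers. The abstract does not specify how the operator is computed, so the function names (`diffusion_operator`, `integrated_operator`), the kernel bandwidths, and the fixed diffusion times `t1` and `t2` are all hypothetical choices made for illustration. The local denoising and the mechanism for choosing how much low-frequency information to keep per modality are omitted.

```python
import numpy as np

def diffusion_operator(X, sigma=1.0):
    """Row-stochastic Markov matrix from a Gaussian kernel on the rows of X."""
    sq = np.sum(X**2, axis=1)
    # Pairwise squared Euclidean distances, clipped to avoid tiny negatives.
    d2 = np.maximum(sq[:, None] + sq[None, :] - 2 * X @ X.T, 0.0)
    K = np.exp(-d2 / (2 * sigma**2))
    return K / K.sum(axis=1, keepdims=True)

def integrated_operator(X1, X2, t1=2, t2=2, sigma1=1.0, sigma2=1.0):
    """Assumed combination rule: product of the powered per-modality operators."""
    P1 = np.linalg.matrix_power(diffusion_operator(X1, sigma1), t1)
    P2 = np.linalg.matrix_power(diffusion_operator(X2, sigma2), t2)
    return P1 @ P2

# Toy data: two noisy "sensors" observing the same 1-D latent variable.
rng = np.random.default_rng(0)
n = 50
latent = np.linspace(0, 1, n)[:, None]
X1 = latent + 0.05 * rng.normal(size=(n, 1))
X2 = np.hstack([latent, latent**2]) + 0.05 * rng.normal(size=(n, 2))

P = integrated_operator(X1, X2)
X1_denoised = P @ X1  # smooth modality 1 using structure from both modalities
```

Because each factor is row-stochastic, the product `P` is also row-stochastic, so applying it to a modality replaces every sample with a weighted average of samples that are close in both views. That averaging is the sense in which diffusion acts as a low-pass filter here.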