Baker Frazier N, Porollo Aleksey
Department of Electrical Engineering and Computing Systems, University of Cincinnati, Cincinnati, OH 45221, USA.
Center for Autoimmune Genomics and Etiology, Cincinnati Children's Hospital Medical Center, Cincinnati, OH 45229, USA.
Data (Basel). 2018 Mar;3(1). doi: 10.3390/data3010004. Epub 2018 Jan 13.
Similarity and distance matrices are general data structures that describe reciprocal relationships between the objects within a given dataset. Commonly used methods for representing these matrices include heatmaps, hierarchical trees, dimensionality reduction, and various types of networks. However, despite a well-developed foundation for the visualization of such representations, the challenge of creating an interactive view that allows quick data navigation and interpretation remains largely unaddressed. This problem becomes especially evident for large matrices with hundreds or thousands of objects. In this work, we present a web-based platform for the interactive analysis of large (dis-)similarity matrices. It consists of four major interconnected and synchronized components: a zoomable heatmap, an interactive hierarchical tree, a scalable circular relationship diagram, and a 3D multi-dimensional scaling (MDS) scatterplot. We demonstrate the use of the platform for the analysis of amino acid covariance data in proteins as part of our previously developed CoeViz tool. The web platform enables quick and focused analysis of protein features, such as structural domains and functional sites.
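To make the 3D MDS scatterplot component concrete, the following is a minimal sketch of how a (dis-)similarity matrix can be mapped to three-dimensional coordinates. It is not the platform's implementation: the abstract does not state which MDS variant is used, so classical (Torgerson) MDS is assumed here, and the function name `classical_mds`, the toy matrix, and the `1 - similarity` conversion are all illustrative assumptions.

```python
import numpy as np

def classical_mds(dist, n_components=3):
    """Embed objects in n_components dimensions from a symmetric distance matrix.

    Assumption: classical (Torgerson) MDS; the source only says "MDS".
    """
    dist = np.asarray(dist, dtype=float)
    n = dist.shape[0]
    # Double-center the squared distances: B = -1/2 * J * D^2 * J
    centering = np.eye(n) - np.ones((n, n)) / n
    gram = -0.5 * centering @ (dist ** 2) @ centering
    # eigh returns eigenvalues in ascending order; keep the largest ones
    eigvals, eigvecs = np.linalg.eigh(gram)
    order = np.argsort(eigvals)[::-1][:n_components]
    # Clip small negative eigenvalues that can arise for non-Euclidean input
    scales = np.sqrt(np.clip(eigvals[order], 0.0, None))
    return eigvecs[:, order] * scales

# Hypothetical 4-object similarity matrix with values in [0, 1]
similarity = np.array([
    [1.0, 0.9, 0.2, 0.1],
    [0.9, 1.0, 0.3, 0.2],
    [0.2, 0.3, 1.0, 0.8],
    [0.1, 0.2, 0.8, 1.0],
])
distance = 1.0 - similarity  # assumed similarity-to-distance conversion
coords = classical_mds(distance, n_components=3)
print(coords.shape)  # (4, 3): one 3D point per object
```

Each row of `coords` is a point that could be drawn in a 3D scatterplot. Objects with high mutual similarity (here, the first two and the last two) end up close together, which is the property that makes such a view useful for spotting groups such as structural domains.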