Skuta Ctibor, Bartůněk Petr, Svozil Daniel
Laboratory of Informatics and Chemistry, Faculty of Chemical Technology, Institute of Chemical Technology Prague, Technická 5, CZ-166 28 Prague, Czech Republic; CZ-OPENSCREEN, Institute of Molecular Genetics of the ASCR, v. v. i., Vídeňská 1083, CZ-142 20 Prague, Czech Republic.
CZ-OPENSCREEN, Institute of Molecular Genetics of the ASCR, v. v. i., Vídeňská 1083, CZ-142 20 Prague, Czech Republic.
J Cheminform. 2014 Sep 17;6(1):44. doi: 10.1186/s13321-014-0044-4. eCollection 2014 Dec.
Hierarchical clustering is an exploratory data analysis method that reveals groups (clusters) of similar objects. Its result is a tree structure called a dendrogram, which shows how the individual clusters are arranged. To investigate the row/column hierarchical cluster structure of a data matrix, a visualization tool called a 'cluster heatmap' is commonly employed. In the cluster heatmap, the data matrix is displayed as a heatmap, a 2-dimensional array in which the colour of each element corresponds to its value. The rows/columns of the matrix are ordered such that similar rows/columns are near each other. The ordering is given by the dendrogram, which is displayed at the side of the heatmap.
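The row ordering described above can be illustrated with a minimal sketch. It assumes Euclidean distance and average linkage, neither of which is specified in the text, and the function name `cluster_order` is hypothetical. It is a plain-Python illustration of agglomerative clustering for small, non-empty matrices, not the method used by the library.

```python
from itertools import combinations
from math import dist


def cluster_order(rows):
    """Cluster rows agglomeratively; return (dendrogram, leaf order).

    Assumptions (not from the source): Euclidean distance, average linkage,
    and a non-empty list of equal-length numeric rows.
    """
    # Each cluster is (nested tree of row indices, flat list of member indices).
    clusters = [(i, [i]) for i in range(len(rows))]

    def avg_distance(a, b):
        # Average linkage: mean distance over all cross-cluster row pairs.
        total = sum(dist(rows[i], rows[j]) for i in a[1] for j in b[1])
        return total / (len(a[1]) * len(b[1]))

    while len(clusters) > 1:
        # Find and merge the two closest clusters.
        a, b = min(combinations(clusters, 2), key=lambda p: avg_distance(*p))
        clusters.remove(a)
        clusters.remove(b)
        clusters.append(((a[0], b[0]), a[1] + b[1]))

    # The flat member list of the root is the left-to-right leaf order.
    return clusters[0]


matrix = [[1.0, 1.1], [9.0, 9.2], [1.2, 0.9], [8.8, 9.1]]
tree, order = cluster_order(matrix)
# Reorder the matrix so that similar rows end up next to each other.
ordered_matrix = [matrix[i] for i in order]
print(tree, order)
```

The nested tuple returned as `tree` plays the role of the dendrogram, and `order` is the row permutation a cluster heatmap would display. Columns can be ordered the same way by applying the function to the transposed matrix.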
We developed InCHlib (Interactive Cluster Heatmap Library), a highly interactive and lightweight library for cluster heatmap visualization and exploration. InCHlib enables the user to select individual or clustered heatmap rows, to zoom in and out of clusters, or to flexibly modify the heatmap appearance. The cluster heatmap can be augmented with additional metadata displayed in a different colour scale. In addition, to further enhance the visualization, the cluster heatmap can be interconnected with external data sources or analysis tools. Data clustering and the preparation of the input file for InCHlib are facilitated by the Python utility script inchlib_clust.
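To make the notion of a colour scale concrete, the following sketch maps a numeric value to an RGB colour by linear interpolation between two end colours. The function name `value_to_colour`, its parameters, and the blue-to-red default are assumptions chosen for illustration; the sketch does not reproduce the library's own colour handling.

```python
def value_to_colour(value, vmin, vmax, low=(0, 0, 255), high=(255, 0, 0)):
    """Map value in [vmin, vmax] to an RGB tuple between `low` and `high`.

    Hypothetical helper for illustration; values outside the range are clamped.
    """
    if vmax == vmin:
        # Degenerate range: place every value in the middle of the scale.
        t = 0.5
    else:
        t = (value - vmin) / (vmax - vmin)
    t = min(1.0, max(0.0, t))
    # Interpolate each colour channel independently.
    return tuple(round(lo + t * (hi - lo)) for lo, hi in zip(low, high))


data_row = [0.2, 3.5, 7.9]
vmin, vmax = min(data_row), max(data_row)
# Data cells use the default blue-to-red scale.
data_colours = [value_to_colour(v, vmin, vmax) for v in data_row]
# A metadata column can use a separate scale, here white to green.
metadata_colour = value_to_colour(42, 0, 100, low=(255, 255, 255), high=(0, 128, 0))
print(data_colours, metadata_colour)
```

Keeping the scale as a function of `vmin`, `vmax`, and two end colours lets the data and the metadata be coloured independently, which is what displaying metadata "in a different colour scale" amounts to.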
The cluster heatmap is one of the most popular visualizations of large chemical and biomedical data sets originating, e.g., in high-throughput screening, genomics or transcriptomics experiments. The presented library, InCHlib, is a client-side solution for cluster heatmap exploration. InCHlib can be easily deployed into any modern web application and configured to cooperate with external tools and data sources. Though InCHlib is primarily intended for the analysis of chemical or biological data, it is a versatile tool whose application domain is not limited to the life sciences.