Manjunath Mohith, Zhang Yi, Yeo Steve H, Sobh Omar, Russell Nathan, Followell Christian, Bushell Colleen, Ravaioli Umberto, Song Jun S
Department of Bioengineering, University of Illinois at Urbana-Champaign, Urbana, IL 61801, USA.
Carl R. Woese Institute for Genomic Biology, University of Illinois at Urbana-Champaign, Urbana, IL 61801, USA.
PeerJ Comput Sci. 2018;4. doi: 10.7717/peerj-cs.155. Epub 2018 May 21.
Clustering is one of the most common techniques used in data analysis to discover hidden structures by grouping data points that are similar by some measure into clusters. Although many programs are available for performing clustering, a single web resource that provides both state-of-the-art clustering methods and interactive visualizations is lacking. ClusterEnG (an acronym for Clustering Engine for Genomics) provides an interface for clustering big data and interactive visualizations, including 3D views, cluster selection, and zoom features. ClusterEnG also aims to educate users about the similarities and differences between various clustering algorithms and provides clustering tutorials that demonstrate potential pitfalls of each algorithm. The web resource will be particularly useful to scientists who are not conversant with computing but want to understand the structure of their data in an intuitive manner.
ClusterEnG is part of a bigger project called KnowEnG (Knowledge Engine for Genomics) and is available at http://education.knoweng.org/clustereng.