Line Eikvil, Tor-Kristian Jenssen, Marit Holden
Norwegian Computing Center, P.O. Box 114 Blindern, NO-0314 Oslo, Norway.
PubGene AS, Sognsveien 70A, PO Box 37 Vinderen, 0319 Oslo, Norway.
J Biomed Inform. 2015 Jun;55:116-23. doi: 10.1016/j.jbi.2015.03.012. Epub 2015 Apr 11.
Searches of the biomedical literature, for instance in PubMed, often return document collections so large that the results need to be organized. Clustering is an efficient tool for organizing search results. To help the user decide how to continue the search for relevant documents, the content of each cluster can be characterized by a set of representative keywords, or cluster labels. Because different users may have different interests, it is desirable to be able to produce labels from a selection of different topical categories. We therefore introduce the concept of multi-focus cluster labeling, which gives users an overview of the contents through labels from multiple viewpoints. The concept has been established and demonstrated on three different document collections. We illustrate that multi-focus visualizations can give an overview of clusters along axes that general labels are not able to convey. The approach is generic and should be applicable to any biomedical (or other) domain, with any selection of foci for which appropriate focus vocabularies can be established. A user evaluation also indicates that such a multi-focus concept is useful.
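The idea of multi-focus cluster labeling can be illustrated with a minimal Python sketch. It is not the authors' method: the abstract does not state how labels are scored, so the scoring rule here (share of documents inside the cluster that contain a term, minus the share outside it) is an assumption chosen for simplicity. The function name `multi_focus_labels`, the focus names (gene, disease, drug), and the toy documents are likewise hypothetical. The sketch only shows how one set of clusters can receive a separate label list per focus once a vocabulary exists for each focus.

```python
from collections import Counter


def doc_freq(docs):
    """Count, for each term, how many documents contain it."""
    counts = Counter()
    for doc in docs:
        counts.update(set(doc))
    return counts


def multi_focus_labels(clusters, focus_vocabularies, top_k=2):
    """Return {cluster_id: {focus: [label, ...]}}.

    Hypothetical scoring (an assumption, not taken from the paper):
    share of cluster documents containing the term minus the share
    of all other documents containing it.
    """
    labels = {}
    for cluster_id, docs in clusters.items():
        others = [d for cid, ds in clusters.items() if cid != cluster_id for d in ds]
        inside, outside = doc_freq(docs), doc_freq(others)
        labels[cluster_id] = {}
        for focus, vocabulary in focus_vocabularies.items():
            scored = []
            for term in vocabulary:
                if inside[term] == 0:
                    continue  # term never occurs in this cluster
                share_in = inside[term] / len(docs)
                share_out = outside[term] / len(others) if others else 0.0
                scored.append((share_in - share_out, term))
            # Highest score first; ties are broken alphabetically.
            scored.sort(key=lambda pair: (-pair[0], pair[1]))
            labels[cluster_id][focus] = [term for _, term in scored[:top_k]]
    return labels


# Toy data: each document is a list of already-extracted terms.
CLUSTERS = {
    "c1": [
        ["brca1", "breast", "cancer", "tamoxifen"],
        ["brca1", "breast", "cancer"],
        ["tp53", "breast", "cancer", "tamoxifen"],
    ],
    "c2": [
        ["cftr", "cystic", "fibrosis", "ivacaftor"],
        ["cftr", "cystic", "fibrosis"],
    ],
}

# One vocabulary per focus; only terms in a vocabulary can become labels.
FOCUS_VOCABULARIES = {
    "gene": {"brca1", "tp53", "cftr"},
    "disease": {"cancer", "fibrosis"},
    "drug": {"tamoxifen", "ivacaftor"},
}

labels = multi_focus_labels(CLUSTERS, FOCUS_VOCABULARIES)
for cluster_id, per_focus in labels.items():
    print(cluster_id, per_focus)
```

In this toy setup each cluster ends up with one label list per focus, so a user interested in genes and a user interested in drugs can read the same clustering from different viewpoints. A real system would need curated focus vocabularies and most likely a more robust scoring rule.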