Liu Shixia, Yin Jialun, Wang Xiting, Cui Weiwei, Cao Kelei, Pei Jian
IEEE Trans Vis Comput Graph. 2016 Nov;22(11):2451-66. doi: 10.1109/TVCG.2015.2509990. Epub 2015 Dec 17.
We present an online visual analytics approach to helping users explore and understand hierarchical topic evolution in high-volume text streams. The key idea behind this approach is to identify representative topics in incoming documents and align them with the existing representative topics that they immediately follow (in time). To this end, we learn a set of streaming tree cuts from topic trees based on user-selected focus nodes. A dynamic Bayesian network model has been developed to derive the tree cuts in the incoming topic trees to balance the fitness of each tree cut and the smoothness between adjacent tree cuts. By connecting the corresponding topics at different times, we are able to provide an overview of the evolving hierarchical topics. A sedimentation-based visualization has been designed to enable the interactive analysis of streaming text data from global patterns to local details. We evaluated our method on real-world datasets and the results are generally favorable.
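The trade-off described in the abstract, choosing a tree cut that fits the incoming topic tree while staying close to the previous cut, can be illustrated with a small sketch. This is not the dynamic Bayesian network model from the paper: it is a brute-force toy that enumerates every cut of a tiny tree and scores each one. The tree, the per-node scores, the `fitness` and `smoothness` definitions (mean node score and Jaccard overlap), and the `weight` parameter are all assumptions made for illustration.

```python
from itertools import product

# Hypothetical toy topic tree: node -> list of children (leaves have none).
TREE = {
    "root": ["politics", "sports"],
    "politics": ["election", "policy"],
    "sports": [],
    "election": [],
    "policy": [],
}


def enumerate_cuts(tree, node):
    """Yield every tree cut of the subtree rooted at `node`.

    A tree cut is a set of nodes such that each root-to-leaf path
    passes through exactly one of them.
    """
    # Option 1: cut at this node, hiding everything below it.
    yield frozenset([node])
    children = tree[node]
    if children:
        # Option 2: descend, combining one cut from each child subtree.
        child_cuts = [list(enumerate_cuts(tree, c)) for c in children]
        for combo in product(*child_cuts):
            yield frozenset().union(*combo)


def fitness(cut, node_score):
    """Assumed fitness: mean of hypothetical per-node quality scores."""
    return sum(node_score[n] for n in cut) / len(cut)


def smoothness(cut, prev_cut):
    """Assumed smoothness: Jaccard overlap with the previous time step's cut."""
    return len(cut & prev_cut) / len(cut | prev_cut)


def best_cut(tree, root, node_score, prev_cut, weight=0.5):
    """Pick the cut maximizing a weighted sum of fitness and smoothness.

    `weight` = 0 ignores the previous cut; `weight` = 1 ignores fitness.
    """
    def score(cut):
        return (1 - weight) * fitness(cut, node_score) + weight * smoothness(cut, prev_cut)

    return max(enumerate_cuts(tree, root), key=score)


if __name__ == "__main__":
    # Hypothetical scores: finer topics describe the new documents better.
    scores = {"root": 0.2, "politics": 0.6, "sports": 0.8,
              "election": 0.9, "policy": 0.9}
    previous = frozenset({"politics", "sports"})
    print(sorted(best_cut(TREE, "root", scores, previous, weight=0.5)))
    print(sorted(best_cut(TREE, "root", scores, previous, weight=0.0)))
```

With a nonzero `weight`, the previous cut pulls the selection toward topics the user has already seen, which is the intuition behind balancing fitness against smoothness. Enumerating all cuts is only feasible for very small trees, so this sketch says nothing about how the paper handles realistic tree sizes or user-selected focus nodes.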