Vrotsou Katerina, Nordman Aida
IEEE Trans Vis Comput Graph. 2018 Jun 18. doi: 10.1109/TVCG.2018.2848247.
Sequential pattern mining finds applications in numerous diverse fields. Due to the problem's combinatorial nature, two main challenges arise. First, existing algorithms output large numbers of patterns, many of which are uninteresting from a user's perspective. Second, as datasets grow, mining large numbers of patterns becomes computationally expensive. There is thus a need for mining approaches that make it possible to focus the pattern search towards directions of interest. This work tackles this problem by combining interactive visualization with sequential pattern mining in order to create a "transparent box" execution model. We propose a novel approach to interactive visual sequence mining that allows the user to guide the execution of a pattern-growth algorithm at suitable points through a powerful visual interface. Our approach (1) introduces the possibility of using local constraints during the mining process, (2) allows stepwise visualization of patterns being mined, and (3) enables the user to steer the mining algorithm towards directions of interest. The use of local constraints significantly improves users' capability to progressively refine the search space without the need to restart computations. We exemplify our approach using two event sequence datasets: one composed of web page visits and another composed of individuals' activity sequences.
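To make the idea of steering a pattern-growth miner with local constraints more concrete, here is a minimal sketch in Python. It is not the authors' implementation: it assumes a simplified PrefixSpan-style miner over sequences of single events, and the names `grow`, `mine`, `steer`, and `only_home_at_root` are hypothetical. The `steer` callback stands in for the user's interactive choice at each step, and its decision applies only to the branch being extended.

```python
from collections import Counter


def grow(projected, prefix, min_support, steer, out):
    """Extend `prefix` by one event at a time (pattern growth).

    projected: list of (sequence, start) pairs, where sequence[start:] is
               the part of the sequence left after matching `prefix`.
    steer:     hypothetical callback standing in for the user. Given the
               current prefix and its frequent extensions, it returns the
               events to keep exploring on this branch only, which is
               what makes the constraint local.
    """
    # Support of an event = number of sequences whose suffix contains it.
    support = Counter()
    for seq, start in projected:
        support.update(set(seq[start:]))
    frequent = {e: n for e, n in support.items() if n >= min_support}

    for event in steer(prefix, frequent):
        if event not in frequent:
            continue  # ignore choices that are not frequent here
        pattern = prefix + (event,)
        out[pattern] = frequent[event]
        # Project each sequence past its first occurrence of `event`.
        next_projected = []
        for seq, start in projected:
            try:
                pos = seq.index(event, start)
            except ValueError:
                continue  # this sequence does not support the pattern
            next_projected.append((seq, pos + 1))
        grow(next_projected, pattern, min_support, steer, out)


def mine(sequences, min_support, steer):
    """Return {pattern: support} for the branches that `steer` allows."""
    out = {}
    grow([(tuple(s), 0) for s in sequences], (), min_support, steer, out)
    return out


# Toy web-visit sequences (made up for illustration).
sequences = [
    ["home", "search", "product", "cart"],
    ["home", "product", "cart"],
    ["home", "search", "product"],
    ["search", "home", "cart"],
]


def only_home_at_root(prefix, frequent):
    # Local constraint: at the root, follow only patterns starting with
    # "home"; deeper in the search, explore every frequent extension.
    if prefix == ():
        return ["home"]
    return sorted(frequent)


result = mine(sequences, min_support=3, steer=only_home_at_root)
for pattern, count in sorted(result.items()):
    print(" -> ".join(pattern), count)
```

Because the constraint is evaluated per branch, changing the choice at one prefix only affects the subtree below it, and patterns already found on other branches remain valid. In an interactive tool, the callback would be replaced by the user's selection in the visual interface.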