Graduate School of Informatics, Nagoya University, Nagoya, Aichi, Japan.
RIKEN Center for Advanced Intelligence Project, Tokyo, Japan.
PLoS One. 2021 Sep 23;16(9):e0256329. doi: 10.1371/journal.pone.0256329. eCollection 2021.
Given a set of sequences composed of time-ordered events, sequential pattern mining can identify frequent subsequences across different sequences or within a single sequence. In sport, however, these techniques cannot determine how important particular patterns of play are to good or bad outcomes, which is often of greater interest to coaches and performance analysts. In this study, we apply a recently proposed supervised sequential pattern mining algorithm, safe pattern pruning (SPP), to 490 labelled event sequences representing passages of play from one rugby team's matches in the 2018 Japan Top League season. Using SPP, we obtain the patterns that are most discriminative between scoring and non-scoring outcomes from both the team's and the opposition teams' perspectives. We compare these with the most frequent patterns obtained by well-known unsupervised sequential pattern mining algorithms applied to subsets of the original dataset, split on the label. Line breaks, successful line-outs, regained kicks in play, repeated phase-breakdown play, and failed exit plays by the opposition team were the patterns that discriminated most between the team scoring and not scoring. Opposition team line breaks, errors made by the team, opposition team line-outs, and repeated phase-breakdown play by the opposition team were the patterns that discriminated most between the opposition team scoring and not scoring. Compared with the patterns obtained by the unsupervised methods, those obtained by SPP were more sophisticated, in that they contained a greater variety of events, probably because of the supervised nature and the pruning/safe-screening mechanisms of SPP. When interpreted, the SPP-obtained patterns would also be more useful to coaches and performance analysts.
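The contrast between frequent and discriminative patterns can be illustrated with a minimal sketch. This is not SPP and does not implement its pruning or safe-screening steps; as a simple stand-in, it scores each subsequence by the difference in its relative support between scoring and non-scoring sequences. The event names, the toy data, and the helper functions `subsequences`, `support`, and `gap` are all hypothetical and do not come from the study's dataset.

```python
from collections import Counter
from itertools import combinations


def subsequences(seq, max_len=2):
    """Distinct subsequences (order kept, gaps allowed) of length <= max_len."""
    found = set()
    for n in range(1, max_len + 1):
        for idx in combinations(range(len(seq)), n):
            found.add(tuple(seq[i] for i in idx))
    return found


def support(sequences, max_len=2):
    """Number of sequences that contain each subsequence at least once."""
    counts = Counter()
    for seq in sequences:
        counts.update(subsequences(seq, max_len))
    return counts


# Hypothetical toy data: (events in a passage of play, 1 = scored, 0 = did not score).
toy = [
    (["lineout", "phase", "linebreak"], 1),
    (["scrum", "phase", "linebreak"], 1),
    (["lineout", "phase", "error"], 0),
    (["scrum", "phase", "kick"], 0),
]

pos = [s for s, y in toy if y == 1]
neg = [s for s, y in toy if y == 0]
sup_pos, sup_neg = support(pos), support(neg)


def gap(pattern):
    """Relative support in scoring sequences minus that in non-scoring ones."""
    return sup_pos[pattern] / len(pos) - sup_neg[pattern] / len(neg)


# Unsupervised view: most frequent patterns within the scoring subset only.
frequent = sorted(sup_pos, key=lambda p: (-sup_pos[p], p))

# Supervised stand-in: patterns ranked by how well they separate the labels.
discriminative = sorted(set(sup_pos) | set(sup_neg), key=lambda p: (-gap(p), p))

for p in discriminative[:3]:
    print(p, sup_pos[p], sup_neg[p], gap(p))
```

In this toy data, `("phase",)` is among the most frequent patterns in the scoring subset, yet it occurs equally often in non-scoring sequences, so its support gap is zero. A frequency-only ranking on a label-split subset cannot reveal this, which is the limitation the abstract attributes to unsupervised methods.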