Sotudeh Hadi
Social Networks Lab, Department of Humanities, Social and Political Sciences, ETH Zürich, Zürich, Switzerland.
Front Sports Act Living. 2025 Feb 5;6:1512386. doi: 10.3389/fspor.2024.1512386. eCollection 2024.
This paper reviews the principles employed to identify team tactical formations in association football, covering over two decades of research based on event and tracking data. It first defines formations and discusses their history and importance. It then introduces the preprocessing and the team-level and position-level principles. Preprocessing involves segmenting matches and normalizing locations, followed by data representation using various options: average locations, hand-engineered features, and graphs for the team-level approaches, and relative locations, distributions, and images for the position-level approaches. Either kind of representation is then followed by applying templates or clustering. Among the limitations for future research to address is the reliance on spatial rather than temporal aggregation, which bases formation identification on newly introduced coordinates that may not be available in raw tracking data. Assuming a fixed number of outfield players (e.g., 10) fails to address scenarios with fewer players due to red cards or injuries. Additionally, accounting for phases of play is crucial to provide more practical context and to reduce noise by excluding irrelevant segments, such as set pieces. The existing formation templates do not support arrangements with more or fewer players in each horizontal line (e.g., 6-3-1). On the other hand, clustering forces new observations to be described with previously learned clusters, preventing the discovery of emerging formations. Lastly, alternative evaluation methods should be explored more rigorously in the absence of ground-truth labels. Overall, this study identifies assumptions, consequences, and drawbacks associated with formation identification principles to structure the body of knowledge and establish a foundation for future work.
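To make the team-level idea of "average locations" concrete, the following is a minimal Python sketch under stated assumptions. It is a simplified heuristic for illustration only, not a method taken from the reviewed studies. The function names `average_locations` and `formation_label`, the frame layout (a dict mapping a player id to an `(x, y)` pair), and the choice of three horizontal lines are all hypothetical. It assumes locations are already normalized so that the team attacks in the +x direction and that the goalkeeper has been excluded.

```python
from statistics import mean


def average_locations(frames):
    """Average each player's (x, y) over a match segment.

    `frames` is a list of dicts {player_id: (x, y)} (hypothetical layout).
    Players missing from some frames are averaged over the frames they appear in.
    """
    points = {}
    for frame in frames:
        for pid, (x, y) in frame.items():
            points.setdefault(pid, []).append((x, y))
    return {
        pid: (mean(p[0] for p in pts), mean(p[1] for p in pts))
        for pid, pts in points.items()
    }


def formation_label(avg_locs, n_lines=3):
    """Label a formation by splitting players into horizontal lines.

    Players are sorted by depth (x) and cut at the n_lines - 1 largest
    gaps between neighbours. The player count is not fixed at 10.
    """
    xs = sorted(x for x, _ in avg_locs.values())
    if len(xs) < n_lines:
        raise ValueError("need at least one player per line")
    # (gap size, index of the player just before the gap)
    gaps = [(xs[i + 1] - xs[i], i) for i in range(len(xs) - 1)]
    cuts = sorted(i for _, i in sorted(gaps, reverse=True)[: n_lines - 1])
    counts, start = [], 0
    for cut in cuts:
        counts.append(cut + 1 - start)
        start = cut + 1
    counts.append(len(xs) - start)
    return "-".join(str(c) for c in counts)


# Two illustrative frames with made-up coordinates (not real tracking data).
frame_a = {
    1: (20, 10), 2: (22, 30), 3: (22, 50), 4: (20, 70),
    5: (50, 10), 6: (48, 30), 7: (48, 50), 8: (50, 70),
    9: (80, 30), 10: (80, 50),
}
frame_b = {pid: (x + 2, y) for pid, (x, y) in frame_a.items()}

label = formation_label(average_locations([frame_a, frame_b]))
print(label)
```

Because the sketch counts whichever players are present, a segment with nine outfield players still yields a label, which touches on the fixed-player-count limitation noted above. It does nothing about the other limitations, such as phases of play or evaluation without ground-truth labels.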