Antonini Valerio, Sheridan Dermot, Roantree Mark
School of Computing, Dublin City University, Dublin, Ireland.
Insight Centre for Data Analytics, Dublin, Ireland.
Data Brief. 2024 Oct 28;57:111082. doi: 10.1016/j.dib.2024.111082. eCollection 2024 Dec.
Research in field sports often measures player performance during competitive games using individual and time-based descriptive statistics. The data are generated using GPS technologies, which capture simple attributes such as time (seconds) and position (latitude and longitude). While the data capture is highly granular and produces relatively high volumes, the raw data are unsuited to any form of analysis or machine learning. The dataset presented here is created through a data engineering process, driven by domain experts, that transforms the GPS coordinates into a series of player actions. Using 14 outfield players from each of 11 games, we present a database comprising 12 variables and almost 160k actions. It is intended for reuse by machine learning researchers, sport scientists and coaches, who may have different requirements expressed as different analytical queries. The dataset is dimensional in nature, facilitating a rich set of analytics across dimensions such as game, player, action type and duration.