Oliva-Lozano José M, Vidal Miguel, Yousefian Farzad, Cost Rick, Gabbett Tim J
United States Soccer Federation, Chicago, IL, United States.
CIDESD, Research Center in Sports Sciences, Health Sciences and Human Development, Department of Sport Sciences, University of Beira Interior, Covilhã, Portugal.
J Hum Kinet. 2025 May 29;98:169-182. doi: 10.5114/jhk/195563. eCollection 2025 Jul.
The aim of this study was to build an XGBoost model to predict the match outcome and to analyze match-related technical, tactical and physical performance features that may influence the predicted outcome. This was an observational study with a retrospective design. The FIFA post-match summary reports were downloaded at the end of the 2023 Women's World Cup and used to create a dataset of match-related technical, tactical and physical performance variables. An XGBoost model was then built to predict the match outcome and to investigate which performance features might influence the prediction. The overall model achieved an accuracy of 0.58 ± 0.05. Losses and wins had similar predictive accuracy (0.67 ± 0.06 and 0.67 ± 0.08, respectively), but the prediction of draws performed significantly worse, with an accuracy of 0.32 ± 0.16. The top ten features for predicting wins were: (1) out to in actions by the opponent, (2) attempts at the goal, (3) in-behind actions, (4) interceptions by the opponent, (5) loose ball receptions, (6) sprinting per minute by the opponent, (7) offers received by the opponent, (8) in-front opponent, (9) interceptions, and (10) total distance per minute. The top ten features for predicting losses were: (1) attempts at the goal by the opponent, (2) interceptions, (3) out to in actions, (4) possessions interrupted, (5) loose ball receptions by the opponent, (6) in front movements, (7) distance covered by the opponent, (8) in-behind actions by the opponent, (9) total distance, and (10) sprinting per minute. In conclusion, this is the first study to use an XGBoost model to successfully predict wins and losses in the FIFA Women's World Cup and to explain which features significantly influence the prediction. This study may serve as a guide for practitioners regarding the use and application of XGBoost models in high-performance settings.
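The reported figures (an overall accuracy plus separate accuracies for losses, wins and draws, each given as mean ± SD) can be illustrated with a minimal sketch. It assumes, hypothetically, that the ± values come from repeated evaluation folds and that per-outcome accuracy means the share of matches with a given true outcome that were predicted correctly; the abstract does not state either. The function names and the toy labels below are made up for illustration and are not the study's data or code.

```python
from statistics import mean, stdev

LABELS = ("loss", "draw", "win")  # assumed outcome labels


def overall_accuracy(y_true, y_pred):
    """Share of all matches whose outcome was predicted correctly."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def per_class_accuracy(y_true, y_pred, labels=LABELS):
    """Share of matches of each true outcome that were predicted correctly."""
    result = {}
    for label in labels:
        idx = [i for i, t in enumerate(y_true) if t == label]
        if idx:  # skip outcomes that do not occur in this fold
            result[label] = sum(y_pred[i] == label for i in idx) / len(idx)
    return result


def summarize(folds):
    """folds: list of (y_true, y_pred) pairs. Returns {name: (mean, sd)}."""
    scores = {"overall": []}
    for y_true, y_pred in folds:
        scores["overall"].append(overall_accuracy(y_true, y_pred))
        for label, acc in per_class_accuracy(y_true, y_pred).items():
            scores.setdefault(label, []).append(acc)
    return {
        name: (mean(vals), stdev(vals) if len(vals) > 1 else 0.0)
        for name, vals in scores.items()
    }


# Toy folds with invented labels, only to exercise the functions.
folds = [
    (["win", "loss", "draw", "win", "loss", "draw"],
     ["win", "loss", "win", "win", "draw", "draw"]),
    (["win", "loss", "draw", "win", "loss", "draw"],
     ["win", "loss", "loss", "loss", "loss", "win"]),
]
summary = summarize(folds)
for name, (m, sd) in summary.items():
    print(f"{name}: {m:.2f} ± {sd:.2f}")
```

Reporting accuracy per outcome, as the abstract does, makes it visible when one class (here, draws) is predicted much less reliably than the overall figure suggests.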