Hecksteden Anne, Schmartz Georges Pierre, Egyptien Yanni, Aus der Fünten Karen, Keller Andreas, Meyer Tim
Saarland University, Institute of Sports and Preventive Medicine, Saarbruecken, Germany.
Saarland University, Chair for Clinical Bioinformatics, Saarbruecken, Germany.
Sci Med Footb. 2023 Aug;7(3):214-228. doi: 10.1080/24733938.2022.2095006. Epub 2022 Jul 4.
Identifying players or circumstances associated with an increased risk of injury is fundamental for successful risk management in football. So far, time-constant and volatile risk factors have generally been considered separately, in either a screening (constant) or a monitoring (volatile) approach, each resulting in a restricted set of explanatory variables. Consequently, improvements in predictive accuracy may be expected when screening and monitoring data are combined, especially when analysed with current machine learning (ML) techniques.
This trial was designed as a prospective observational cohort study aiming to forecast non-contact time-loss injuries in male professional football (soccer). Injuries were registered according to the Fuller consensus. Gradient boosting with ROSE upsampling within a leave-one-out cross-validation was used for data analysis. The hierarchical data structure was considered throughout. Different splits of the original dataset were used to probe the robustness of results.
Data from 88 players in 4 teams, with 51 injuries, could be analysed. The cross-validated performance of the gradient boosted model (ROC area under the curve 0.61) was promising and higher than that of models without integration of screening data. Importantly, holdout test set performance was similar (ROC area under the curve 0.62), indicating a prospect of generalizability to new cases. However, the variation of predictive accuracy and feature importance across different splits of the original dataset reflects the relatively low number of events.
It is concluded that ML-based injury forecasting based on the integration of screening and monitoring data is promising. However, external prospective verification and continued model development are required.
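The analysis described in the abstract (gradient boosting, upsampling, and cross-validation that respects the hierarchical data structure) could be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' pipeline: it uses scikit-learn, treats the player as the grouping unit (the abstract does not say which unit was left out), runs on synthetic data, and substitutes plain random oversampling for ROSE, which in its original form generates new minority-class samples by smoothed bootstrap. The helper names `oversample_minority`, `grouped_cv_auc` and `make_synthetic` are hypothetical.

```python
import numpy as np
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import LeaveOneGroupOut


def oversample_minority(X, y, rng):
    """Plain random oversampling of the minority class.

    Assumption: a simple stand-in for ROSE, not ROSE itself.
    """
    classes, counts = np.unique(y, return_counts=True)
    minority = classes[np.argmin(counts)]
    deficit = counts.max() - counts.min()
    minority_idx = np.flatnonzero(y == minority)
    # Draw minority rows with replacement until both classes are equal in size.
    extra = rng.choice(minority_idx, size=deficit, replace=True)
    keep = np.concatenate([np.arange(len(y)), extra])
    return X[keep], y[keep]


def grouped_cv_auc(X, y, groups, seed=0):
    """Leave-one-group-out CV with upsampling applied to training folds only."""
    rng = np.random.default_rng(seed)
    scores = np.full(len(y), np.nan)
    for train, test in LeaveOneGroupOut().split(X, y, groups):
        # Upsample after splitting so copies of a row never leak into the test fold.
        X_tr, y_tr = oversample_minority(X[train], y[train], rng)
        model = GradientBoostingClassifier(random_state=seed)
        model.fit(X_tr, y_tr)
        scores[test] = model.predict_proba(X[test])[:, 1]
    # Pool the out-of-fold predictions and compute a single ROC AUC.
    return roc_auc_score(y, scores), scores


def make_synthetic(n_players=12, obs_per_player=20, seed=0):
    """Synthetic data: 3 features, about 15% positive labels (illustrative only)."""
    rng = np.random.default_rng(seed)
    n = n_players * obs_per_player
    X = rng.normal(size=(n, 3))
    risk = X[:, 0] + 0.5 * rng.normal(size=n)
    y = (risk > np.quantile(risk, 0.85)).astype(int)
    groups = np.repeat(np.arange(n_players), obs_per_player)
    return X, y, groups


if __name__ == "__main__":
    X, y, groups = make_synthetic()
    auc, _ = grouped_cv_auc(X, y, groups)
    print(f"pooled out-of-fold ROC AUC: {auc:.2f}")
```

Splitting by group keeps all observations of one player on the same side of each split, which is one common way to account for repeated measurements per player. Upsampling only the training fold avoids the optimistic bias that arises when duplicated rows end up in both training and test data.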