La Trobe Sport and Exercise Medicine Research Centre, College of Science, Health and Engineering, La Trobe University, Melbourne, AUSTRALIA.
Essendon Football Club, Melbourne, AUSTRALIA.
Med Sci Sports Exerc. 2018 Nov;50(11):2267-2276. doi: 10.1249/MSS.0000000000001685.
To evaluate common modeling strategies in training load and injury risk research for modeling continuous variables and interpreting continuous risk estimates, and to present improved modeling strategies.
Workload data were pooled from Australian football (n = 2550) and soccer (n = 23,742) populations to create a representative sample of acute:chronic workload ratio observations for team sports. Injuries were simulated in the data using three predefined risk profiles (U-shaped, flat, and S-shaped). One hundred data sets were simulated with sample sizes of 1000 and 5000 observations. Discrete modeling methods were compared with continuous methods (spline regression and fractional polynomials) for their ability to fit the defined risk profiles. Models were evaluated using measures of discrimination (area under the receiver operating characteristic [ROC] curve) and calibration (Brier score, logarithmic scoring).
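The three evaluation measures named above can be sketched in a few lines of plain Python. This is a minimal illustration, not the authors' code: the function names, the tiny example vectors, and the sign convention for the logarithmic score (mean negative log-likelihood, lower is better) are assumptions made here for clarity.

```python
import math

def brier_score(y_true, p_pred):
    """Mean squared difference between predicted probability and outcome (lower is better)."""
    return sum((p - y) ** 2 for y, p in zip(y_true, p_pred)) / len(y_true)

def log_score(y_true, p_pred, eps=1e-12):
    """Mean negative log-likelihood of the observed outcomes (lower is better).
    The sign convention is an assumption; some authors report the positive log-likelihood."""
    total = 0.0
    for y, p in zip(y_true, p_pred):
        p = min(max(p, eps), 1 - eps)  # guard against log(0)
        total -= math.log(p) if y == 1 else math.log(1 - p)
    return total / len(y_true)

def auc(y_true, p_pred):
    """Area under the ROC curve, computed as the probability that a random
    injured observation is ranked above a random uninjured one (ties count 0.5)."""
    pos = [p for y, p in zip(y_true, p_pred) if y == 1]
    neg = [p for y, p in zip(y_true, p_pred) if y == 0]
    wins = sum(1.0 if a > b else 0.5 if a == b else 0.0 for a in pos for b in neg)
    return wins / (len(pos) * len(neg))

# Hypothetical outcomes (1 = injury) and two sets of predicted risks.
# Both rank the observations identically, but the second overstates risk.
y = [0, 0, 1, 1]
p_calibrated = [0.1, 0.2, 0.7, 0.8]
p_overstated = [0.6, 0.7, 0.8, 0.9]

for name, p in [("calibrated", p_calibrated), ("overstated", p_overstated)]:
    print(name, auc(y, p), round(brier_score(y, p), 3), round(log_score(y, p), 3))
```

Because the AUC depends only on the ranking of predictions, both hypothetical models score identically on discrimination, while the Brier and logarithmic scores penalize the model whose predicted probabilities are further from the outcomes.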
Discrete models were inferior to continuous methods for fitting the true injury risk profiles in the data. Discrete methods had higher false discovery rates (16%-21%) than continuous methods (3%-7%). Evaluating models using the area under the ROC curve incorrectly identified discrete models as superior in over 30% of simulations. Brier and logarithmic scoring were better suited to assessing model performance, selecting discrete models in fewer than 6% of simulations.
Many studies on the relationship between training loads and injury that have used regression modeling have significant limitations due to improper discretization of continuous variables and risk estimates. Continuous methods are better suited to modeling the relationship between training load and injury. Comparing injury risk models using ROC curves can lead to inferior model selection. Measures of calibration are more informative for judging the utility of injury risk models.
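To illustrate why discretizing a continuous workload variable loses information, the sketch below compares a smooth risk curve with the step function produced by cutting the acute:chronic workload ratio (ACWR) into categories. Everything in it is hypothetical: the U-shaped risk function, its coefficients, and the cut points at 1.0 and 1.5 were chosen for illustration and are not taken from the study.

```python
from bisect import bisect_right

EDGES = [1.0, 1.5]  # hypothetical cut points, giving three ACWR categories

def u_shaped_risk(acwr):
    """Hypothetical U-shaped injury probability, lowest at ACWR = 1.0."""
    return 0.02 + 0.08 * (acwr - 1.0) ** 2

def category(acwr):
    """Index of the ACWR category that a value falls into."""
    return bisect_right(EDGES, acwr)

# Evaluate the hypothetical risk on an even grid of ACWR values from 0.50 to 2.00.
grid = [(50 + i) / 100 for i in range(151)]

# A discrete model can at best recover the average risk within each category.
by_bin = {}
for x in grid:
    by_bin.setdefault(category(x), []).append(u_shaped_risk(x))
bin_mean = {k: sum(v) / len(v) for k, v in by_bin.items()}

def step_risk(acwr):
    """Risk assigned by the discretized model: one value per category."""
    return bin_mean[category(acwr)]

# Largest gap between the smooth curve and its discretized version.
max_error = max(abs(u_shaped_risk(x) - step_risk(x)) for x in grid)
print("risk at 1.00 vs 1.49 (continuous):", u_shaped_risk(1.0), u_shaped_risk(1.49))
print("risk at 1.00 vs 1.49 (discrete):  ", step_risk(1.0), step_risk(1.49))
print("largest absolute error:", max_error)
```

Under these assumptions, the discretized model assigns the same risk to an ACWR of 1.00 and 1.49 even though the underlying curve differs between them, and it introduces an artificial jump in risk at each cut point.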