Department of Industrial & Operations Engineering, University of Michigan, Ann Arbor, Michigan, USA.
Medical Scientist Training Program, University of Michigan Medical School, Ann Arbor, Michigan, USA.
J Am Med Inform Assoc. 2022 Oct 7;29(11):1931-1940. doi: 10.1093/jamia/ocac130.
Occupational injuries (OIs) impose an immense burden on the US population. Prediction models help focus resources on those at greatest risk of a delayed return to work (RTW). RTW depends on factors that develop over time; however, existing methods use only information collected at the time of injury. We investigate the performance benefits of dynamically estimating RTW, using longitudinal observations of diagnoses and treatments collected beyond the time of initial injury.
We characterize the difference in predictive performance between an approach that uses information collected at the time of initial injury (baseline model) and a proposed approach that uses longitudinal information collected over the course of the patient's recovery period (proposed model). To control the comparison, both models use the same deep learning architecture and differ only in the information used. Using a large longitudinal observational dataset of OI claims, we compare the two approaches on daily prediction of future work state (working vs not working), assessing performance by the area under the receiver operating characteristic curve (AUROC) and expected calibration error (ECE).
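The two metrics named above can be illustrated with a minimal, standard-library Python sketch. It is not the study's evaluation code: the function names, the rank-based AUROC formulation, and the choice of 10 equal-width probability bins for ECE are assumptions made for illustration, since the abstract does not specify how either metric was computed.

```python
def auroc(labels, scores):
    """Rank-based AUROC (Mann-Whitney U), using average ranks for tied scores."""
    pairs = sorted(zip(scores, labels))
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("AUROC needs at least one positive and one negative label")
    rank_sum_pos = 0.0
    i = 0
    while i < len(pairs):
        # Find the run of tied scores pairs[i:j].
        j = i
        while j < len(pairs) and pairs[j][0] == pairs[i][0]:
            j += 1
        # 1-based ranks i+1 .. j share their average rank.
        avg_rank = (i + 1 + j) / 2.0
        rank_sum_pos += avg_rank * sum(label for _, label in pairs[i:j])
        i = j
    u_statistic = rank_sum_pos - n_pos * (n_pos + 1) / 2.0
    return u_statistic / (n_pos * n_neg)


def expected_calibration_error(labels, probs, n_bins=10):
    """Binary ECE: weighted mean gap between the observed positive rate and the
    mean predicted probability within equal-width bins (binning is an assumption)."""
    counts = [0] * n_bins
    prob_sums = [0.0] * n_bins
    label_sums = [0.0] * n_bins
    for label, prob in zip(labels, probs):
        idx = min(int(prob * n_bins), n_bins - 1)  # prob == 1.0 goes in the last bin
        counts[idx] += 1
        prob_sums[idx] += prob
        label_sums[idx] += label
    total = len(labels)
    ece = 0.0
    for count, prob_sum, label_sum in zip(counts, prob_sums, label_sums):
        if count:
            ece += (count / total) * abs(label_sum / count - prob_sum / count)
    return ece
```

Here a label of 1 would stand for one work state (for example, not working) and the score for the model's predicted probability of that state on a given day. AUROC depends only on how the predictions are ordered, whereas ECE depends on their absolute values, which is why the two are reported together.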
After subsampling and applying inclusion criteria, our final dataset covered 294 103 OIs, which were split evenly among training, development, and test datasets (1/3 each). In terms of discriminative performance on the test dataset, the proposed model had an AUROC of 0.728 (90% confidence interval: 0.723, 0.734) versus the baseline's 0.591 (0.585, 0.598). The proposed model had an ECE of 0.004 (0.003, 0.005) versus the baseline's 0.016 (0.009, 0.018).
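The abstract does not say how the 90% confidence intervals were obtained. One common way to get such intervals is a percentile bootstrap over the test set, sketched below in standard-library Python. The procedure, the function names (`pairwise_auroc`, `bootstrap_ci`), the number of resamples, and the seed are all assumptions for illustration, not the authors' method.

```python
import random


def pairwise_auroc(labels, scores):
    """AUROC as the fraction of (positive, negative) pairs ranked correctly.
    Returns None when a resample contains only one class."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        return None
    wins = sum((p > q) + 0.5 * (p == q) for p in pos for q in neg)
    return wins / (len(pos) * len(neg))


def bootstrap_ci(labels, scores, metric, n_boot=500, level=0.90, seed=0):
    """Percentile bootstrap interval for metric(labels, scores)."""
    rng = random.Random(seed)
    n = len(labels)
    stats = []
    for _ in range(n_boot):
        # Resample observations with replacement.
        idx = [rng.randrange(n) for _ in range(n)]
        value = metric([labels[i] for i in idx], [scores[i] for i in idx])
        if value is not None:  # skip degenerate single-class resamples
            stats.append(value)
    if not stats:
        raise ValueError("no valid bootstrap resamples")
    stats.sort()
    alpha = (1.0 - level) / 2.0
    lower = stats[int(alpha * (len(stats) - 1))]
    upper = stats[int((1.0 - alpha) * (len(stats) - 1))]
    return lower, upper
```

For example, `bootstrap_ci(y_true, y_prob, pairwise_auroc)` returns a `(lower, upper)` pair for AUROC. Because daily predictions for the same injury are correlated, resampling whole claims rather than individual days may be the more appropriate unit in a setting like this one; the sketch resamples individual observations only for simplicity.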
The longitudinal approach outperforms current practice and shows potential for leveraging observational data to dynamically update predictions of RTW in the setting of OI. This approach may enable physicians and workers' compensation programs to manage large populations of injured workers more effectively.