Jung Saemi, Kim Bogeum, Kim Yoon-Ji, Lee Eun-Soo, Kang Dongmug, Kim Youngki
Department of Occupational and Environmental Medicine, Pusan National University Yangsan Hospital, Republic of Korea.
School of Computer Science and Engineering, Pusan National University, Republic of Korea.
Saf Health Work. 2025 Mar;16(1):113-121. doi: 10.1016/j.shaw.2025.01.003. Epub 2025 Jan 20.
This study aimed to develop prediction models for the work-relatedness of shoulder diseases through machine learning algorithms.
The dataset comprised 7,270 cases among 8,302 individuals who applied for occupational disease compensation for shoulder musculoskeletal disorders and received a final approval decision from the Disease Evaluation Committee of the Korea Workers' Compensation and Welfare Service between January 2020 and December 2021. Demographic characteristics and differences in approval rates across shoulder diseases were analyzed. In addition, machine learning algorithms, including logistic regression, support vector machine, decision tree, random forest, and XGBoost, were used to construct prediction models for work-relatedness assessment.
The performance of each model was evaluated. XGBoost achieved an accuracy of 81.64% and an area under the curve (AUC) of 0.73, and random forest achieved an accuracy of 84.46% and an AUC of 0.73. The key factors influencing work-relatedness assessment were employment period, physical burden score, gender, and age.
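The two metrics reported above measure different things: accuracy counts correct decisions at a fixed threshold, while the area under the ROC curve measures how well the scores rank approved cases above non-approved ones. The following is a minimal sketch in plain Python, using hypothetical toy labels and scores (not the study's data, models, or class balance), to illustrate how the two can diverge. The function names `accuracy` and `roc_auc` and the 0.5 threshold are assumptions made for illustration.

```python
def accuracy(y_true, y_score, threshold=0.5):
    """Fraction of cases whose thresholded score matches the true label."""
    preds = [1 if s >= threshold else 0 for s in y_score]
    correct = sum(p == t for p, t in zip(preds, y_true))
    return correct / len(y_true)


def roc_auc(y_true, y_score):
    """AUC as the probability that a random positive case is scored
    above a random negative case (ties count as one half)."""
    pos = [s for t, s in zip(y_true, y_score) if t == 1]
    neg = [s for t, s in zip(y_true, y_score) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: 1 = approved as work-related, 0 = not approved.
y_true = [1, 1, 1, 1, 1, 1, 1, 1, 0, 0]
y_score = [0.9, 0.8, 0.8, 0.7, 0.7, 0.6, 0.6, 0.55, 0.65, 0.3]

print(f"accuracy = {accuracy(y_true, y_score):.4f}")  # accuracy = 0.9000
print(f"AUC      = {roc_auc(y_true, y_score):.4f}")   # AUC      = 0.8125
```

In this toy set, one non-approved case is scored above three approved cases, which lowers the AUC even though only one decision is wrong at the chosen threshold. Whether a similar effect explains the figures in the abstract cannot be determined from the abstract alone.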
The machine learning techniques applied showed high performance scores, suggesting that they could help reduce differences in judgment among occupational and environmental medicine physicians.