Taori Sakshi, Lim Sol
Department of Industrial and Systems Engineering, Virginia Polytechnic Institute and State University, 1145 Perry Street, Blacksburg, VA, USA.
Appl Ergon. 2025 May;125:104427. doi: 10.1016/j.apergo.2024.104427. Epub 2024 Dec 10.
The performance of machine learning (ML) algorithms depends on the dataset on which they are trained. Although ML algorithms are increasingly used for lift risk assessment, many are trained and tested on controlled simulation datasets that lack diversity in lifting conditions. This raises concerns about their applicability in real-world settings, where lifting scenarios and postures vary substantially. Our study investigates how different lifting scenarios affect the performance of ML algorithms trained on surface electromyography (sEMG) armband sensor data to classify hand-load levels (2.3 and 6.8 kg). Twelve healthy participants (6 male and 6 female) performed repetitive lifting tasks under several lifting scenarios: symmetric (S), asymmetric (A), and free-dynamic (F). Separate algorithms were developed using different training datasets (S, A, S+A, and F), ML classifiers, and sEMG features, and all were tested on the F dataset, which represents unconstrained, naturalistic lifts. Mean accuracy and sensitivity were significantly lower for models trained on the constrained (S) dataset than for those trained on naturalistic lifts (F). Models trained with frequency-domain sEMG features achieved greater accuracy, precision, and sensitivity than those trained with time-domain features. In conclusion, ML algorithms trained on controlled symmetric lifts performed poorly when predicting loads for dynamic, unconstrained lifts; particular attention is therefore needed when applying such algorithms in real-world scenarios.
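As an illustration of the three reported metrics, the following minimal sketch computes accuracy, precision, and sensitivity for a two-class hand-load problem. It is not the authors' code: the function name `classification_metrics`, the choice of the 6.8 kg load as the positive class, and the example labels are all assumptions made here for illustration.

```python
from typing import Dict, Sequence

LIGHT = 2.3  # kg, lower hand-load level from the abstract
HEAVY = 6.8  # kg, higher hand-load level; treated as the positive class (assumption)


def classification_metrics(
    y_true: Sequence[float], y_pred: Sequence[float], positive: float = HEAVY
) -> Dict[str, float]:
    """Return accuracy, precision, and sensitivity (recall) for binary labels."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = len(pairs) - tp - fp - fn
    return {
        "accuracy": (tp + tn) / len(pairs) if pairs else 0.0,
        "precision": tp / (tp + fp) if (tp + fp) else 0.0,
        "sensitivity": tp / (tp + fn) if (tp + fn) else 0.0,
    }


# Made-up labels for eight free-dynamic (F) test lifts; not data from the study.
y_true = [LIGHT, LIGHT, LIGHT, LIGHT, HEAVY, HEAVY, HEAVY, HEAVY]
y_pred = [LIGHT, LIGHT, LIGHT, HEAVY, HEAVY, HEAVY, LIGHT, LIGHT]

metrics = classification_metrics(y_true, y_pred)
print(metrics)
```

In a cross-scenario evaluation like the one described, the same function would be applied to predictions on the F test set from each model (trained on S, A, S+A, or F), so that the metrics are directly comparable across training conditions.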