Department of General, Visceral, and Transplantation Surgery, University of Heidelberg, Im Neuenheimer Feld 110, 69120, Heidelberg, Germany.
Department of Medical Biometry and Informatics, University of Heidelberg, Im Neuenheimer Feld 130.3, 69120, Heidelberg, Germany.
Surg Endosc. 2019 Nov;33(11):3732-3740. doi: 10.1007/s00464-019-06667-4. Epub 2019 Feb 21.
The most common way of assessing surgical performance is for expert raters to view a surgical task and rate the trainee's performance. However, there is considerable potential for automated skill assessment and workflow analysis using modern technology. The aim of the present study was to evaluate machine learning (ML) algorithms using data from a Myo armband as a sensor device for skill level assessment and phase detection in laparoscopic training.
Participants of three experience levels in laparoscopy performed a suturing and knot-tying task on silicone models. Experts rated performance using the Objective Structured Assessment of Technical Skills (OSATS). Participants wore Myo armbands (Thalmic Labs™, Ontario, Canada) to record acceleration, angular velocity, orientation, and Euler orientation. ML algorithms (decision forest, neural networks, boosted decision tree) were compared for skill level assessment and phase detection.
Twenty-eight participants (8 beginners, 10 intermediates, 10 experts) were included, and 99 knots were available for analysis. A neural network regression model had the lowest mean absolute error in predicting OSATS score (3.7 ± 0.6 points, r = 0.03 ± 0.81; OSATS min.-max.: 4-37 points). An ensemble of binary-class neural networks yielded the highest accuracy in predicting skill level (beginners: 82.2% correctly identified, intermediates: 3.0%, experts: 79.5%), whereas standard statistical analysis failed to discriminate between skill levels. Phase detection on raw data showed the best results with a multi-class decision jungle (on average 16% correctly identified), but average accuracy improved to 43% with two-class boosted decision trees after applying dynamic time warping (DTW).
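To make the error metric concrete, the following is a minimal sketch of how a mean absolute error between expert-rated and model-predicted OSATS scores could be computed. The function name and the example scores are hypothetical and chosen purely for illustration; they are not data from the study, and this is not the authors' implementation.

```python
def mean_absolute_error(rated_scores, predicted_scores):
    """Average absolute difference between rated and predicted scores."""
    if not rated_scores or len(rated_scores) != len(predicted_scores):
        raise ValueError("score lists must be non-empty and of equal length")
    total = sum(abs(r - p) for r, p in zip(rated_scores, predicted_scores))
    return total / len(rated_scores)


# Hypothetical OSATS scores (illustrative only, not study data).
rated = [10, 20, 30, 37]
predicted = [13, 16, 33, 32]
mae = mean_absolute_error(rated, predicted)
print(f"MAE: {mae} points")
```

A value in points is directly comparable to the OSATS range reported above, which is why the metric is easy to interpret in this setting.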
Modern machine learning algorithms aid in interpreting complex surgical motion data, even when standard analysis fails. Dynamic time warping offers the potential to process and compare surgical motion data in order to allow automated surgical workflow detection. However, further research is needed to interpret and standardize available data and improve sensor accuracy.
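As an illustration of why dynamic time warping is useful for comparing motion recordings performed at different speeds, here is a minimal sketch of the textbook DTW distance for one-dimensional signals. It assumes an absolute-difference local cost and no warping window; the function name and the sample signals are hypothetical, and the study's actual preprocessing pipeline is not described in the abstract.

```python
from math import inf


def dtw_distance(a, b):
    """Textbook dynamic time warping distance between two 1-D sequences."""
    n, m = len(a), len(b)
    # cost[i][j] holds the minimal cumulative cost of aligning a[:i] with b[:j].
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = abs(a[i - 1] - b[j - 1])  # assumed local cost
            cost[i][j] = local + min(
                cost[i - 1][j],      # a advances, b repeats its sample
                cost[i][j - 1],      # b advances, a repeats its sample
                cost[i - 1][j - 1],  # both advance
            )
    return cost[n][m]


# Hypothetical signals: the same movement performed slowly and quickly.
slow = [0.0, 0.0, 1.0, 1.0, 2.0, 2.0, 1.0, 0.0]
fast = [0.0, 1.0, 2.0, 1.0, 0.0]
print(dtw_distance(slow, fast))
```

Because the alignment may stretch or compress time, two recordings of the same gesture at different tempos can have a small DTW distance even though a sample-by-sample comparison would not be possible for sequences of unequal length.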