Olsen Rikke Groth, Bjerrum Flemming, Andersen Annarita Ghosh, Konge Lars, Røder Andreas, Svendsen Morten Bo Søndergaard
Department of Urology, Copenhagen Prostate Cancer Center, Copenhagen University Hospital-Rigshospitalet, Copenhagen, Denmark.
Copenhagen Academy for Medical Education and Simulation (CAMES), Centre for HR & Education, the Capital Region of Denmark, Ryesgade 53B, 2100, Copenhagen, Denmark.
J Robot Surg. 2025 Jul 18;19(1):404. doi: 10.1007/s11701-025-02556-2.
Surgical gesture analysis is a promising method for assessing the quality of surgical procedures, but manual annotation is time-consuming. We aimed to develop a recurrent neural network for automated surgical gesture annotation using simulated robot-assisted radical prostatectomies. We had previously manually annotated 161 videos with five different surgical gestures (Regular dissection, Hemostatic control, Clip application, Needle handling, and Suturing). We created a model consisting of two neural networks: a pre-trained feature extractor (a Vision Transformer pre-trained on ImageNet) and a classification head (a recurrent neural network with a Long Short-Term Memory layer (LSTM, 128 units) and a fully connected layer). The data set was split into a training + validation set and a test set. The trained model labeled input sequences with one of the five surgical gestures. The overall performance of the neural networks was assessed with metrics for multi-label classification and with Total Agreement, a metric we defined as an extended version of Intersection over Union (IoU). Our neural network predicted the class of surgical gestures with an Area Under the Curve (AUC) of 0.95 (95% CI 0.93-0.96) and an F1-score of 0.71 (95% CI 0.67-0.75). The network classified each surgical gesture with high accuracy (0.84-0.97) and high specificity (0.90-0.99), but with lower sensitivity (0.62-0.81). The average Total Agreement for each gesture class was between 0.72 (95% CI ± 0.03) and 0.91 (95% CI ± 0.02). We successfully developed a high-performing neural network to analyze gestures in simulated surgical procedures. Our next step is to use the network to annotate surgical videos and to evaluate the efficacy of the annotated gestures in predicting patient outcomes.
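The abstract describes a two-stage architecture: an ImageNet-pretrained Vision Transformer as a frame-level feature extractor and an LSTM(128) plus fully connected layer as the classification head. The sketch below is a minimal, hypothetical reconstruction of that design; the ViT variant (vit_b_16), the frozen extractor, the sequence length, and the use of the last time step for the sequence label are assumptions not stated in the abstract.

```python
# Hypothetical sketch of the two-network model: pretrained ViT feature
# extractor + LSTM(128) and fully connected classification head.
import torch
import torch.nn as nn
from torchvision.models import vit_b_16, ViT_B_16_Weights

NUM_GESTURES = 5  # Regular dissection, Hemostatic control, Clip application,
                  # Needle handling, Suturing

class GestureClassifier(nn.Module):
    def __init__(self, hidden_size: int = 128, num_classes: int = NUM_GESTURES):
        super().__init__()
        # ImageNet-pretrained Vision Transformer; replace its classification
        # head with identity so it outputs 768-d frame embeddings.
        vit = vit_b_16(weights=ViT_B_16_Weights.IMAGENET1K_V1)
        vit.heads = nn.Identity()
        self.feature_extractor = vit
        for p in self.feature_extractor.parameters():
            p.requires_grad = False  # assumption: extractor is kept frozen
        # Recurrent classification head.
        self.lstm = nn.LSTM(input_size=768, hidden_size=hidden_size, batch_first=True)
        self.fc = nn.Linear(hidden_size, num_classes)

    def forward(self, frames: torch.Tensor) -> torch.Tensor:
        # frames: (batch, time, 3, 224, 224)
        b, t, c, h, w = frames.shape
        feats = self.feature_extractor(frames.view(b * t, c, h, w))  # (b*t, 768)
        feats = feats.view(b, t, -1)
        out, _ = self.lstm(feats)      # (b, t, hidden)
        return self.fc(out[:, -1, :])  # one gesture label per input sequence

# Usage: logits = GestureClassifier()(torch.randn(2, 16, 3, 224, 224))
```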
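Total Agreement is described only as an extended version of Intersection over Union (IoU); its exact definition is given in the paper, not the abstract. The following minimal sketch shows the standard per-class, frame-level IoU it builds on, computed between predicted and manually annotated gesture labels; treating an absent class as perfect agreement is an assumption for illustration.

```python
# Minimal sketch of per-class frame-level IoU between predicted and manual
# gesture annotations (the basis that "Total Agreement" extends).
import numpy as np

def frame_iou(pred: np.ndarray, truth: np.ndarray, gesture_class: int) -> float:
    """IoU for one gesture class over frame-level label sequences."""
    p = pred == gesture_class
    t = truth == gesture_class
    union = np.logical_or(p, t).sum()
    if union == 0:
        return 1.0  # class absent in both annotations: perfect agreement
    return float(np.logical_and(p, t).sum() / union)

# Example: two 10-frame annotations with gesture classes 0..4
pred  = np.array([0, 0, 1, 1, 1, 2, 2, 3, 4, 4])
truth = np.array([0, 0, 0, 1, 1, 2, 2, 3, 3, 4])
print([round(frame_iou(pred, truth, c), 2) for c in range(5)])
```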