Department of Biology and Biological Engineering, Caltech, Pasadena, CA.
Center for Robotic Simulation & Education, Catherine & Joseph Aresty Department of Urology, USC Institute of Urology, University of Southern California, Los Angeles, CA.
Surgery. 2021 May;169(5):1240-1244. doi: 10.1016/j.surg.2020.08.016. Epub 2020 Sep 26.
Our previous work established a taxonomy of suturing gestures during the vesicourethral anastomosis of robotic radical prostatectomy and associated those gestures with tissue tears and patient outcomes. Herein, we train deep learning-based computer vision models to automate the identification and classification of suturing gestures during needle driving attempts.
Using two independent raters, we manually annotated live suturing video clips to label timepoints and gestures. Identification (2,395 videos) and classification (511 videos) datasets were compiled to train computer vision models producing 2-class and 5-class label predictions, respectively. Networks were trained on inputs of raw red/green/blue (RGB) pixels as well as optical flow for each frame. Each model was trained on an 80/20 train/test split.
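The 80/20 train/test split described above can be sketched as follows. This is a minimal, illustrative stdlib-only snippet, not the authors' actual data pipeline; the seeded shuffle and the clip-index list are assumptions for the example.

```python
import random

def train_test_split(items, test_frac=0.2, seed=0):
    """Shuffle a list of clip indices and split it into train/test subsets."""
    rng = random.Random(seed)  # fixed seed so the split is reproducible
    shuffled = items[:]
    rng.shuffle(shuffled)
    n_test = int(len(shuffled) * test_frac)
    return shuffled[n_test:], shuffled[:n_test]

# 2,395 clips, as in the identification dataset reported in the abstract
clips = list(range(2395))
train, test = train_test_split(clips)  # 1,916 train / 479 test
```

Splitting once with a fixed seed keeps the held-out 20% constant across model comparisons, which matters when contrasting architectures on the same data.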
In this study, all models reliably predicted both the presence of a gesture (identification; area under the curve: 0.88) and the type of gesture (classification; area under the curve: 0.87) at significantly above-chance levels. For both the gesture identification and classification datasets, we observed no effect of the choice of recurrent classification model (long short-term memory unit versus convolutional long short-term memory unit) on performance.
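The area-under-the-curve (AUC) metric reported above is equivalent to the probability that a randomly chosen positive clip is scored higher than a randomly chosen negative one (the Mann-Whitney formulation). A minimal stdlib-only sketch, with the example labels and scores invented for illustration:

```python
def roc_auc(labels, scores):
    """ROC AUC via pairwise comparison: fraction of (positive, negative)
    pairs where the positive clip receives the higher score; ties count half."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Hypothetical gesture-presence labels and model scores for four clips
print(roc_auc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8]))  # → 0.75
```

An AUC of 0.5 corresponds to chance, which is why the reported 0.88 and 0.87 indicate performance significantly above chance.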
Our results demonstrate that computer vision can recognize features that not only identify the act of suturing but also distinguish between classes of suturing gestures. This demonstrates the potential of deep learning-based computer vision for the future automation of surgical skill assessment.