Surgical AI and Innovation Laboratory, Massachusetts General Hospital, 15 Parkman St., WAC 460, Boston, MA, 02114, USA.
Department of Surgery, Massachusetts General Hospital, Boston, MA, USA.
Surg Endosc. 2021 Jul;35(7):4008-4015. doi: 10.1007/s00464-020-07833-9. Epub 2020 Jul 27.
Artificial intelligence (AI) and computer vision (CV) have revolutionized image analysis. In surgery, CV applications have focused on surgical phase identification in laparoscopic videos. We proposed to apply CV techniques to identify phases in an endoscopic procedure, peroral endoscopic myotomy (POEM).
POEM videos were collected from Massachusetts General and Showa University Koto Toyosu Hospitals. Surgeons labeled the videos with the following ground truth phases: (1) Submucosal injection, (2) Mucosotomy, (3) Submucosal tunnel, (4) Myotomy, and (5) Mucosotomy closure. The deep-learning CV model, a convolutional neural network (CNN) combined with a long short-term memory (LSTM) network, was trained on 30 videos to create POEMNet. We then used POEMNet to identify operative phases in the remaining 20 videos. The model's performance was compared to the surgeon-annotated ground truth.
POEMNet's overall phase identification accuracy was 87.6% (95% CI 87.4-87.9%). When evaluated on a per-phase basis, the model performed well, with mean unweighted and prevalence-weighted F1 scores of 0.766 and 0.875, respectively. The model performed best with longer phases, with 70.6% accuracy for phases that had a duration under 5 min and 88.3% accuracy for longer phases.
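The gap between the unweighted and prevalence-weighted F1 scores is easier to see with a small worked example. The Python sketch below is illustrative only and is not the authors' evaluation code. It assumes frame-level predictions, and it uses a normal-approximation (Wald) interval for accuracy, which the abstract does not specify. The function names `per_phase_f1` and `summarize` and the toy labels are hypothetical.

```python
from collections import Counter
from math import sqrt


def per_phase_f1(y_true, y_pred, labels):
    """Return the F1 score of each phase, treating it as one-vs-rest."""
    scores = {}
    for label in labels:
        tp = sum(t == label and p == label for t, p in zip(y_true, y_pred))
        fp = sum(t != label and p == label for t, p in zip(y_true, y_pred))
        fn = sum(t == label and p != label for t, p in zip(y_true, y_pred))
        denom = 2 * tp + fp + fn
        # F1 = 2TP / (2TP + FP + FN); defined as 0 when the phase never occurs.
        scores[label] = 2 * tp / denom if denom else 0.0
    return scores


def summarize(y_true, y_pred, labels):
    """Accuracy with an assumed Wald 95% CI, plus macro and weighted F1.

    Every value in y_true is expected to appear in labels.
    """
    n = len(y_true)
    accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / n
    f1 = per_phase_f1(y_true, y_pred, labels)
    support = Counter(y_true)  # number of ground-truth frames per phase

    # Unweighted (macro) mean: every phase counts equally.
    macro_f1 = sum(f1.values()) / len(labels)
    # Prevalence-weighted mean: each phase counts in proportion to its frames.
    weighted_f1 = sum(f1[label] * support[label] for label in labels) / n

    # Assumption: normal-approximation interval, not necessarily the paper's.
    half_width = 1.96 * sqrt(accuracy * (1 - accuracy) / n)
    return {
        "accuracy": accuracy,
        "ci95": (accuracy - half_width, accuracy + half_width),
        "macro_f1": macro_f1,
        "weighted_f1": weighted_f1,
    }


if __name__ == "__main__":
    # Toy data: a long phase predicted well and a short phase predicted poorly.
    truth = ["tunnel"] * 8 + ["closure"] * 2
    preds = ["tunnel"] * 8 + ["tunnel", "closure"]
    print(summarize(truth, preds, ["tunnel", "closure"]))
```

In this toy case the short phase drags the unweighted mean down more than the prevalence-weighted one, the same direction as the reported 0.766 versus 0.875.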
A deep-learning-based approach to CV, previously successful in laparoscopic video phase identification, translates well to endoscopic procedures. With continued refinements, AI could contribute to intra-operative decision-support systems and post-operative risk prediction.