Department of Neurosurgery, Keck School of Medicine, University of Southern California, Los Angeles, California, USA.
Department of Computer Science, Viterbi School of Engineering, University of Southern California, Los Angeles, California, USA.
Oper Neurosurg (Hagerstown). 2022 Sep 1;23(3):235-240. doi: 10.1227/ons.0000000000000274. Epub 2022 May 26.
Intraoperative tool movement data have been demonstrated to be clinically useful in quantifying surgical performance. However, collecting this information from intraoperative video requires laborious hand annotation. The ability to automatically annotate tools in surgical video would advance surgical data science by eliminating a time-intensive step in research.
To determine whether machine learning (ML) can automatically identify surgical instruments in neurosurgical video.
An ML model that automatically identifies surgical instruments in each frame was developed and trained on multiple publicly available surgical video data sets with instrument location annotations. A total of 39 693 frames from 4 data sets were used (endoscopic endonasal surgery [EEA] [30 015 frames], cataract surgery [4670], laparoscopic cholecystectomy [2532], and microscope-assisted brain/spine tumor removal [2476]). A second model trained only on EEA video was also developed. Intraoperative EEA videos from YouTube were used as test data (3 videos, 1239 frames).
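Although the abstract does not name the detection architecture, a setup like this typically fine-tunes a pretrained object detector on the annotated frames. The sketch below uses torchvision's Faster R-CNN as an illustrative stand-in; the model choice, class count, and training loop are assumptions, not the authors' implementation.

```python
# Minimal fine-tuning sketch, assuming a torchvision Faster R-CNN detector.
# The architecture, NUM_CLASSES, and hyperparameters are illustrative
# assumptions; the paper's abstract does not specify them.
import torch
from torchvision.models.detection import fasterrcnn_resnet50_fpn
from torchvision.models.detection.faster_rcnn import FastRCNNPredictor

NUM_CLASSES = 2  # background + a generic "surgical instrument" class (assumption)

model = fasterrcnn_resnet50_fpn(weights="DEFAULT")  # COCO-pretrained backbone
in_features = model.roi_heads.box_predictor.cls_score.in_features
model.roi_heads.box_predictor = FastRCNNPredictor(in_features, NUM_CLASSES)

optimizer = torch.optim.SGD(model.parameters(), lr=0.005, momentum=0.9)

def train_one_epoch(model, loader, optimizer, device="cuda"):
    """One pass over (image, target) pairs; targets hold 'boxes' and 'labels'."""
    model.train().to(device)
    for images, targets in loader:
        images = [img.to(device) for img in images]
        targets = [{k: v.to(device) for k, v in t.items()} for t in targets]
        loss_dict = model(images, targets)  # detection losses in train mode
        loss = sum(loss_dict.values())
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
```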
The YouTube test set contained 2169 annotated instruments in total. Mean average precision (mAP) for instrument detection on the YouTube data set was 0.74; the mAPs for the 3 individual videos were 0.65, 0.74, and 0.89. The second model, trained only on EEA video, also achieved an overall mAP of 0.74 (0.62, 0.84, and 0.88 for the individual videos). Development costs were $130 for manual video annotation and under $100 for computation.
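For reference, the mAP reported above is the standard object-detection metric: average precision over ranked predictions, averaged across classes and intersection-over-union thresholds. A minimal sketch of computing it with torchmetrics follows; the boxes, scores, and class labels are made-up illustrations, not data from the study, and the library choice is an assumption about tooling rather than the authors' evaluation code.

```python
# Hedged sketch of a COCO-style mAP computation with torchmetrics.
import torch
from torchmetrics.detection.mean_ap import MeanAveragePrecision

metric = MeanAveragePrecision()  # averages AP over IoU thresholds 0.50:0.95

preds = [{
    "boxes": torch.tensor([[50.0, 40.0, 200.0, 180.0]]),  # xyxy pixel coords
    "scores": torch.tensor([0.91]),
    "labels": torch.tensor([1]),  # 1 = "instrument" (assumed class id)
}]
targets = [{
    "boxes": torch.tensor([[55.0, 45.0, 195.0, 175.0]]),
    "labels": torch.tensor([1]),
}]

metric.update(preds, targets)  # call once per frame, then aggregate
print(metric.compute()["map"])  # overall mAP across the evaluated frames
```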
Surgical instruments contained within endoscopic endonasal intraoperative video can be detected using a fully automated ML model. The addition of disparate surgical data sets did not improve model performance, although these data sets may improve generalizability of the model in other use cases.