Department of Neurosurgery, Keck School of Medicine, University of Southern California, Los Angeles, California, USA.
World Neurosurg. 2021 Jun;150:26-30. doi: 10.1016/j.wneu.2021.03.022. Epub 2021 Mar 17.
Computer vision (CV) is a subset of artificial intelligence that performs computations on image or video data, permitting the quantitative analysis of visual information. Common CV tasks that may be relevant to surgeons include image classification, object detection and tracking, and extraction of higher-order features. Despite the potential applications of CV to intraoperative video, however, few surgeons report using it. A primary roadblock to implementing CV is the lack of a clear workflow for creating an intraoperative video dataset to which CV can be applied. We report general principles for creating usable surgical video datasets and the results of applying them.
Video annotations from cadaveric endoscopic endonasal skull base simulations (n = 20 trials of 1-5 minutes, size = 8 GB) were reviewed by 2 researcher-annotators. An internal, retrospective analysis of the workflow used to develop the intraoperative video annotations was performed to identify guiding practices.
Approximately 34,000 frames of surgical video were annotated. Key considerations in developing annotation workflows include 1) overcoming software and personnel constraints; 2) ensuring adequate storage and access infrastructure; 3) optimizing and standardizing the annotation protocol; and 4) operationalizing annotated data. Potential tools include CVAT (Computer Vision Annotation Tool) and VoTT (Visual Object Tagging Tool), open-source annotation software that allows local video storage, easy setup, and the use of interpolation.
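Interpolation, as used by annotation tools, generally means that an annotator draws a bounding box on selected keyframes and the software fills in the frames between them. The following is a minimal sketch of that idea, assuming simple linear interpolation of box corners between two keyframes. It is not the implementation used by CVAT or VoTT, and the names `Box` and `interpolate_boxes` are hypothetical.

```python
from dataclasses import dataclass, astuple


@dataclass(frozen=True)
class Box:
    """Axis-aligned bounding box in pixel coordinates (hypothetical type)."""
    x1: float
    y1: float
    x2: float
    y2: float


def interpolate_boxes(start_frame, start_box, end_frame, end_box):
    """Linearly interpolate a box between two annotated keyframes.

    Returns a dict mapping each frame index in [start_frame, end_frame]
    to a Box. This is an assumed, simplified scheme for illustration.
    """
    if end_frame <= start_frame:
        raise ValueError("end_frame must be greater than start_frame")
    span = end_frame - start_frame
    boxes = {}
    for frame in range(start_frame, end_frame + 1):
        # Fraction of the way from the first keyframe to the second.
        t = (frame - start_frame) / span
        coords = (
            a + t * (b - a)
            for a, b in zip(astuple(start_box), astuple(end_box))
        )
        boxes[frame] = Box(*coords)
    return boxes


# Example: an instrument box annotated on frames 0 and 4 only.
boxes = interpolate_boxes(0, Box(0, 0, 10, 10), 4, Box(8, 4, 18, 14))
print(boxes[2])
```

Under this scheme, only the keyframes require manual work. Annotators would still need to review the interpolated frames, since motion between keyframes is rarely exactly linear.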
CV techniques can be applied to surgical video, but challenges for novice users may limit adoption. We outline principles of annotation workflow that can mitigate the initial challenges groups may face when converting raw video into usable, annotated datasets.