Department of Biomedical Data Science, Stanford University, Palo Alto, CA, USA.
Department of Ophthalmology, Byers Eye Institute, Stanford University, Palo Alto, CA, USA.
Transl Vis Sci Technol. 2023 Mar 1;12(3):23. doi: 10.1167/tvst.12.3.23.
The purpose of this study was to build a deep-learning model that automatically localizes surgical landmarks in cataract surgical videos and to derive skill-related motion metrics from those locations.
The locations of the pupil, limbus, and 8 classes of surgical instruments were identified by a 2-step algorithm: (1) mask segmentation and (2) landmark identification from the masks. To perform mask segmentation, we trained the YOLACT model on 1156 frames sampled from 268 videos and on the public Cataract Dataset for Image Segmentation (CaDIS). Landmark identification was performed by fitting ellipses or lines to the contours of the masks and deriving locations of interest, including surgical tooltips and the pupil center. Landmark identification was evaluated by the distance between the predicted and true positions in 5853 frames of 10 phacoemulsification video clips. We derived the total path length, maximal speed, and covered area from the tip positions and examined their correlation with human-rated surgical performance.
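For illustration, the sketch below shows how the landmark-identification and metric-derivation steps could be implemented with OpenCV, assuming binary per-class masks are already available from the segmentation step. The nearest-point tooltip heuristic, the convex-hull definition of covered area, and all function and variable names are assumptions of this sketch, not the authors' published implementation (the paper fits ellipses or lines to the mask contours).

    # Minimal sketch: landmark identification from binary masks and
    # derivation of skill-related motion metrics. Assumes non-empty
    # masks; names are illustrative, not the authors' code.
    import numpy as np
    import cv2

    def pupil_center(pupil_mask: np.ndarray) -> tuple[float, float]:
        """Fit an ellipse to the largest pupil-mask contour; return its center."""
        contours, _ = cv2.findContours(
            pupil_mask.astype(np.uint8), cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_NONE
        )
        largest = max(contours, key=cv2.contourArea)
        (cx, cy), _axes, _angle = cv2.fitEllipse(largest)  # needs >= 5 points
        return cx, cy

    def instrument_tip(tool_mask: np.ndarray,
                       pupil_xy: tuple[float, float]) -> tuple[float, float]:
        """Approximate the tooltip as the tool-contour point nearest the
        pupil center (one plausible heuristic, not the paper's line fit)."""
        contours, _ = cv2.findContours(
            tool_mask.astype(np.uint8), cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_NONE
        )
        pts = max(contours, key=cv2.contourArea).reshape(-1, 2).astype(float)
        dists = np.linalg.norm(pts - np.asarray(pupil_xy), axis=1)
        return tuple(pts[np.argmin(dists)])

    def motion_metrics(tips: np.ndarray, fps: float) -> dict[str, float]:
        """Derive metrics from per-frame tip positions (N x 2, in pixels)."""
        steps = np.linalg.norm(np.diff(tips, axis=0), axis=1)  # per-frame displacement
        hull = cv2.convexHull(tips.astype(np.float32))
        return {
            "total_path_length": float(steps.sum()),       # pixels
            "maximal_speed": float(steps.max() * fps),     # pixels / second
            "covered_area": float(cv2.contourArea(hull)),  # pixels^2, convex hull
        }

With tip positions tracked across a clip, motion_metrics would yield the total path length as the sum of per-frame displacements, the maximal speed as the largest displacement scaled by the frame rate, and the covered area as the area of the convex hull of all visited positions.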
The mean average precision score and intersection-over-union for mask detection were 0.78 and 0.82, respectively. The average distances between the predicted and true positions of the pupil center, phaco tip, and second instrument tip were 5.8, 9.1, and 17.1 pixels, respectively. The total path length and covered areas of these landmarks were negatively correlated with surgical performance.
We developed a deep-learning method to localize key anatomical structures of the eye and cataract surgical tools, which can be used to automatically derive metrics correlated with surgical skill.
Our method could form the basis of an automated feedback system that helps cataract surgeons evaluate their performance.