IRCAD Strasbourg, Strasbourg, France.
UHN Toronto, Toronto, Canada.
Int J Comput Assist Radiol Surg. 2020 Sep;15(9):1585-1595. doi: 10.1007/s11548-020-02208-w. Epub 2020 Jun 26.
Inexpensive benchtop training systems offer significant advantages in meeting the increasing demand for training surgeons and gastroenterologists in flexible endoscopy. Established scoring systems exist, based on task duration and mistake evaluation. However, they require trained human raters, which limits broad and low-cost adoption. There is an important unmet need to automate rating with machine learning.
We present a general and robust approach for recognizing training tasks from endoscopic training video, which in turn automates the computation of task duration. Our main technical novelty is to show that the performance of state-of-the-art CNN-based approaches can be improved significantly with a novel semi-supervised learning approach that uses both labelled and unlabelled videos. For the unlabelled videos, we assume only that the task execution order is known a priori.
Two video datasets are presented. The first has 19 videos recorded in examination conditions, where the participants complete their tasks in a predetermined order. The second has 17 h of videos recorded in self-assessment conditions, where participants complete one or more tasks in any order. For the first dataset, we obtain a mean task duration estimation error of 3.65 s, with a mean task duration of 159 s (approximately 2.3% relative error). For the second dataset, we obtain a mean task duration estimation error of 3.67 s. Our semi-supervised learning approach reduces the average error from 5.63% to 3.67%.
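To make the link between task recognition and task duration concrete, the following is a minimal sketch, not the authors' implementation. It assumes a recognizer has already assigned one task label per video frame (with `None` for frames outside any task) and that the frame rate is known. The function names, label names, and toy data are hypothetical. The last line reproduces the relative error implied by the figures reported for the first dataset (3.65 s error on a 159 s mean duration).

```python
from itertools import groupby


def task_durations(frame_labels, fps):
    """Return seconds spent on each task, given one label per frame.

    Hypothetical helper: frames labelled None count as "no task".
    """
    durations = {}
    for label, run in groupby(frame_labels):
        n_frames = sum(1 for _ in run)
        if label is not None:
            durations[label] = durations.get(label, 0.0) + n_frames / fps
    return durations


def mean_abs_error(predicted, reference):
    """Mean absolute duration error (seconds) over the reference tasks."""
    tasks = sorted(reference)
    return sum(abs(predicted.get(t, 0.0) - reference[t]) for t in tasks) / len(tasks)


# Toy per-frame predictions at 25 frames per second (illustrative data only).
labels = [None] * 5 + ["task_a"] * 50 + [None] * 5 + ["task_b"] * 100
predicted = task_durations(labels, fps=25.0)

# Hypothetical rater-annotated durations for the same toy video.
reference = {"task_a": 2.5, "task_b": 4.0}
error_s = mean_abs_error(predicted, reference)

# Relative error from the figures reported for the first dataset.
relative_error_pct = 100 * 3.65 / 159
```

In this sketch a task's duration is the total time covered by its frames, so a task interrupted and later resumed is still counted once; whether the original scoring system treats interruptions that way is not stated in the abstract.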
This work is the first significant step toward automating the rating of flexible endoscopy students using a low-cost benchtop trainer. Thanks to our semi-supervised learning approach, we can scale easily to much larger unlabelled training datasets. The approach can also be used for other phase recognition tasks.