Evaluation of Deep Learning Models for Identifying Surgical Actions and Measuring Performance.

Affiliation

Surgical Safety Technologies, Toronto, Ontario, Canada.

Publication information

JAMA Netw Open. 2020 Mar 2;3(3):e201664. doi: 10.1001/jamanetworkopen.2020.1664.

Abstract

IMPORTANCE

When evaluating surgeons in the operating room, experienced physicians must rely on live or recorded video to assess the surgeon's technical performance, an approach prone to subjectivity and error. Owing to the large number of surgical procedures performed daily, it is infeasible to review every procedure; therefore, there is a tremendous loss of invaluable performance data that would otherwise be useful for improving surgical safety.

OBJECTIVE

To evaluate a framework for assessing surgical video clips by categorizing them based on the surgical step being performed and the level of the surgeon's competence.

DESIGN, SETTING, AND PARTICIPANTS

This quality improvement study assessed 103 video clips of 8 surgeons of various levels performing knot tying, suturing, and needle passing from the Johns Hopkins University-Intuitive Surgical Gesture and Skill Assessment Working Set. Data were collected before 2015, and data analysis took place from March to July 2019.

MAIN OUTCOMES AND MEASURES

Deep learning models were trained to estimate categorical outputs such as performance level (ie, novice, intermediate, and expert) and surgical actions (ie, knot tying, suturing, and needle passing). The efficacy of these models was measured using precision, recall, and model accuracy.
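For readers unfamiliar with these measures, the following is a minimal sketch of how per-class precision, per-class recall, and overall accuracy can be computed for a categorical task such as surgical action recognition. The function name `per_class_metrics` and the toy labels are hypothetical illustrations; they are not the study's code or data.

```python
from collections import Counter


def per_class_metrics(y_true, y_pred):
    """Return ({label: (precision, recall)}, accuracy) for paired label lists."""
    labels = sorted(set(y_true) | set(y_pred))
    true_count = Counter(y_true)   # clips that truly belong to each class
    pred_count = Counter(y_pred)   # clips the model assigned to each class
    # True positives: clips whose predicted label matches the true label.
    tp = Counter(t for t, p in zip(y_true, y_pred) if t == p)

    metrics = {}
    for label in labels:
        # Precision: of the clips predicted as this class, the fraction that were correct.
        precision = tp[label] / pred_count[label] if pred_count[label] else 0.0
        # Recall: of the clips truly in this class, the fraction that were found.
        recall = tp[label] / true_count[label] if true_count[label] else 0.0
        metrics[label] = (precision, recall)

    # Accuracy: fraction of all clips labeled correctly.
    accuracy = sum(tp.values()) / len(y_true)
    return metrics, accuracy


# Hypothetical toy labels for six clips (NOT data from the study).
y_true = ["suturing", "suturing", "knot tying",
          "needle passing", "needle passing", "knot tying"]
y_pred = ["suturing", "needle passing", "knot tying",
          "needle passing", "needle passing", "knot tying"]

metrics, accuracy = per_class_metrics(y_true, y_pred)
for label, (precision, recall) in metrics.items():
    print(f"{label}: precision={precision:.2f}, recall={recall:.2f}")
print(f"accuracy={accuracy:.2f}")
```

In this toy example one suturing clip is mislabeled as needle passing, which lowers recall for suturing and precision for needle passing while leaving knot tying unaffected.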

RESULTS

The provided architectures achieved accuracy in surgical action and performance calculation tasks using only video input. The embedding representation had a mean (root mean square error [RMSE]) precision of 1.00 (0) for suturing, 0.99 (0.01) for knot tying, and 0.91 (0.11) for needle passing, resulting in a mean (RMSE) precision of 0.97 (0.01). Its mean (RMSE) recall was 0.94 (0.08) for suturing, 1.00 (0) for knot tying, and 0.99 (0.01) for needle passing, resulting in a mean (RMSE) recall of 0.98 (0.01). It also estimated scores on the Objective Structured Assessment of Technical Skill Global Rating Scale categories, with a mean (RMSE) precision of 0.85 (0.09) for novice level, 0.67 (0.07) for intermediate level, and 0.79 (0.12) for expert level, resulting in a mean (RMSE) precision of 0.77 (0.04). Its mean (RMSE) recall was 0.85 (0.05) for novice level, 0.69 (0.14) for intermediate level, and 0.80 (0.13) for expert level, resulting in a mean (RMSE) recall of 0.78 (0.03).
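As a quick arithmetic check, the overall figures above are consistent with an unweighted (macro) average of the three per-class values in each group. The abstract does not state how the overall values were computed, so the macro-averaging below is an assumption, and the helper name `macro_average` is hypothetical.

```python
from statistics import mean

# Per-class values copied from the RESULTS paragraph.
action_precision = {"suturing": 1.00, "knot tying": 0.99, "needle passing": 0.91}
action_recall = {"suturing": 0.94, "knot tying": 1.00, "needle passing": 0.99}
skill_precision = {"novice": 0.85, "intermediate": 0.67, "expert": 0.79}
skill_recall = {"novice": 0.85, "intermediate": 0.69, "expert": 0.80}


def macro_average(per_class):
    """Unweighted mean over classes, rounded to 2 decimals (assumed aggregation)."""
    return round(mean(per_class.values()), 2)


summary = {
    "action precision": macro_average(action_precision),
    "action recall": macro_average(action_recall),
    "skill precision": macro_average(skill_precision),
    "skill recall": macro_average(skill_recall),
}
for name, value in summary.items():
    print(f"{name}: {value:.2f}")
```

Under this assumption the four averages come out to 0.97, 0.98, 0.77, and 0.78, matching the reported overall values. The check covers only the means; the parenthetical RMSE values cannot be reproduced from the abstract alone.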

CONCLUSIONS AND RELEVANCE

The proposed models and the accompanying results illustrate that deep machine learning can identify associations in surgical video clips. These are the first steps to creating a feedback mechanism for surgeons that would allow them to learn from their experiences and refine their skills.
