Evaluation of Deep Learning Models for Identifying Surgical Actions and Measuring Performance.

Affiliation

Surgical Safety Technologies, Toronto, Ontario, Canada.

Publication information

JAMA Netw Open. 2020 Mar 2;3(3):e201664. doi: 10.1001/jamanetworkopen.2020.1664.

DOI:10.1001/jamanetworkopen.2020.1664

PMID:32227178

Abstract

IMPORTANCE

When evaluating surgeons in the operating room, experienced physicians must rely on live or recorded video to assess the surgeon's technical performance, an approach prone to subjectivity and error. Owing to the large number of surgical procedures performed daily, it is infeasible to review every procedure; therefore, there is a tremendous loss of invaluable performance data that would otherwise be useful for improving surgical safety.

OBJECTIVE

To evaluate a framework for assessing surgical video clips by categorizing them based on the surgical step being performed and the level of the surgeon's competence.

DESIGN, SETTING, AND PARTICIPANTS

This quality improvement study assessed 103 video clips of 8 surgeons of various levels performing knot tying, suturing, and needle passing from the Johns Hopkins University-Intuitive Surgical Gesture and Skill Assessment Working Set. Data were collected before 2015, and data analysis took place from March to July 2019.

MAIN OUTCOMES AND MEASURES

Deep learning models were trained to estimate categorical outputs such as performance level (ie, novice, intermediate, and expert) and surgical actions (ie, knot tying, suturing, and needle passing). The efficacy of these models was measured using precision, recall, and model accuracy.
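As an illustration of how these measures are conventionally defined for a multi-class task, the sketch below computes per-class precision and recall and overall accuracy from lists of true and predicted labels. It is a minimal example and not the authors' evaluation code; the function names and the toy labels are hypothetical and are not data from the study.

```python
from collections import Counter


def per_class_precision_recall(y_true, y_pred, labels):
    """Return {label: (precision, recall)} for a multi-class prediction."""
    tp, fp, fn = Counter(), Counter(), Counter()
    for truth, pred in zip(y_true, y_pred):
        if truth == pred:
            tp[truth] += 1      # correct prediction for this class
        else:
            fp[pred] += 1       # predicted class was wrong
            fn[truth] += 1      # true class was missed
    scores = {}
    for label in labels:
        predicted = tp[label] + fp[label]   # clips assigned to this class
        actual = tp[label] + fn[label]      # clips that truly belong to it
        precision = tp[label] / predicted if predicted else 0.0
        recall = tp[label] / actual if actual else 0.0
        scores[label] = (precision, recall)
    return scores


def accuracy(y_true, y_pred):
    """Fraction of clips whose predicted label matches the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


# Hypothetical toy labels for six clips (illustration only).
LABELS = ["suturing", "knot tying", "needle passing"]
y_true = ["suturing", "suturing", "knot tying",
          "knot tying", "needle passing", "needle passing"]
y_pred = ["suturing", "needle passing", "knot tying",
          "knot tying", "needle passing", "needle passing"]

scores = per_class_precision_recall(y_true, y_pred, LABELS)
for label in LABELS:
    precision, recall = scores[label]
    print(f"{label}: precision={precision:.2f}, recall={recall:.2f}")
print(f"accuracy={accuracy(y_true, y_pred):.2f}")
```

In this toy case one suturing clip is misclassified as needle passing, which lowers recall for suturing and precision for needle passing while leaving knot tying unaffected.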

RESULTS

The provided architectures achieved accuracy in surgical action and performance calculation tasks using only video input. The embedding representation had a mean (root mean square error [RMSE]) precision of 1.00 (0) for suturing, 0.99 (0.01) for knot tying, and 0.91 (0.11) for needle passing, resulting in a mean (RMSE) precision of 0.97 (0.01). Its mean (RMSE) recall was 0.94 (0.08) for suturing, 1.00 (0) for knot tying, and 0.99 (0.01) for needle passing, resulting in a mean (RMSE) recall of 0.98 (0.01). It also estimated scores on the Objective Structured Assessment of Technical Skills Global Rating Scale categories, with a mean (RMSE) precision of 0.85 (0.09) for novice level, 0.67 (0.07) for intermediate level, and 0.79 (0.12) for expert level, resulting in a mean (RMSE) precision of 0.77 (0.04). Its mean (RMSE) recall was 0.85 (0.05) for novice level, 0.69 (0.14) for intermediate level, and 0.80 (0.13) for expert level, resulting in a mean (RMSE) recall of 0.78 (0.03).
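The abstract does not state how the overall precision and recall were derived from the per-class values. The short check below assumes an unweighted (macro) average across the three classes; under that assumption the per-class means reported above reproduce the overall means to two decimal places. It does not attempt to reproduce the parenthetical RMSE values, which cannot be recovered from the abstract alone.

```python
from statistics import mean

# Per-class means and overall means as reported in the abstract.
reported = {
    "action precision": ([1.00, 0.99, 0.91], 0.97),
    "action recall": ([0.94, 1.00, 0.99], 0.98),
    "skill precision": ([0.85, 0.67, 0.79], 0.77),
    "skill recall": ([0.85, 0.69, 0.80], 0.78),
}


def macro_average(per_class):
    """Unweighted mean over classes (an assumption, not stated in the source)."""
    return mean(per_class)


for name, (per_class, overall) in reported.items():
    avg = macro_average(per_class)
    print(f"{name}: macro average {avg:.2f}, reported {overall:.2f}")
```

The agreement is consistent with macro averaging but does not confirm it, since other weightings could round to the same values.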

CONCLUSIONS AND RELEVANCE

The proposed models and the accompanying results illustrate that deep machine learning can identify associations in surgical video clips. These are the first steps to creating a feedback mechanism for surgeons that would allow them to learn from their experiences and refine their skills.
