Kinematic Adaptive Frame Recognition (KAFR): A Novel Framework for Video Segmentation via Frame Similarity and Surgical Tool Tracking.

Authors

Nguyen Huu Phong, Khairnar Shekhar Madhav, Palacios Sofia Garces, Al-Abbas Amr, Hogg Melissa E, Zureikat Amer H, Polanco Patricio M, Zeh Herbert J, Sankaranarayanan Ganesh

Affiliations

Department of Surgery, University of Texas Southwestern Medical Center, Dallas, TX 75390, USA.

Department of Surgery, NorthShore University HealthSystem, Evanston, IL 60201, USA.

Publication

IEEE Access. 2025;13:101681-101697. doi: 10.1109/access.2025.3573264. Epub 2025 May 23.

DOI:10.1109/access.2025.3573264
PMID:40852196
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC12369944/
Abstract

Interest in using Artificial Intelligence (AI) to automate the analysis of surgical procedures has surged in recent years. Video is one of the primary tools for recording surgical procedures and conducting subsequent analyses, such as performance assessment. However, operative videos tend to be notably longer than videos in other fields, spanning from thirty minutes to several hours, which makes it challenging for AI models to learn from them effectively. Because the volume of such videos is expected to grow in the near future, innovative techniques are needed to address this issue. In this article, we propose a novel technique called Kinematics Adaptive Frame Recognition (KAFR) that efficiently eliminates redundant frames to reduce dataset size and computation time while retaining useful frames to improve accuracy. Specifically, we compute the similarity between consecutive frames by tracking the movement of surgical tools. Our approach follows these steps: 1) Tracking phase: a YOLOv8 model is used to detect the tools present in the scene; 2) Similarity phase: similarities between consecutive frames are computed by estimating the variation in the spatial positions and velocities of the tools; 3) Classification phase: an X3D CNN is trained to classify the surgical phase for segmentation. We evaluate the effectiveness of our approach on datasets obtained through retrospective reviews of cases at two referral centers. The newly annotated Gastrojejunostomy (GJ) dataset covers procedures performed between 2017 and 2021, while the previously annotated Pancreaticojejunostomy (PJ) dataset spans 2011 to 2022 at the same centers. In the GJ dataset, each robotic GJ video is segmented into six distinct phases. By adaptively selecting relevant frames, we achieve a reduction in the number of frames while improving accuracy by 4.32% (from 0.749 to 0.7814) and the F1 score by 0.16%. Our approach is also evaluated on the PJ dataset, where it achieves a fivefold reduction of data and a 2.05% accuracy improvement (from 0.8801 to 0.8982), along with a 2.54% increase in F1 score (from 0.8534 to 0.8751). We also compare our approach with state-of-the-art approaches to highlight its competitiveness in terms of performance and efficiency. Although we examined our approach on the GJ and PJ datasets for phase segmentation, it could also be applied to broader, more general surgical datasets. Furthermore, KAFR can supplement existing approaches, enhancing their performance by reducing redundant data while retaining key information, which makes it a valuable addition to other AI models.
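The similarity phase described above can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `select_keyframes`, the input layout (one dictionary per frame mapping a tool id to its bounding-box centre, as a detector such as YOLOv8 might supply), the change score (position displacement plus a weighted velocity difference, both measured against the last kept frame), and the `threshold` and `velocity_weight` values are all assumptions made for illustration.

```python
from math import hypot


def select_keyframes(tracks, threshold=10.0, velocity_weight=1.0):
    """Return the indices of frames to keep (hypothetical helper).

    tracks: list with one dict per frame, mapping a tool id to the (x, y)
    centre of its bounding box in pixels.
    A frame is kept when the tools have changed enough, in position and
    velocity, relative to the last kept frame.
    """
    if not tracks:
        return []

    kept = [0]            # always keep the first frame
    ref = tracks[0]       # tool positions at the last kept frame
    ref_vel = {}          # tool velocities at the last kept frame
    prev = tracks[0]      # tool positions at the previous frame

    for i in range(1, len(tracks)):
        cur = tracks[i]
        # Per-frame velocity of each tool visible in both this and the previous frame.
        vel = {t: (cur[t][0] - prev[t][0], cur[t][1] - prev[t][1])
               for t in cur if t in prev}

        if set(cur) != set(ref):
            # A tool appeared or disappeared: treat the frame as dissimilar.
            change = float("inf")
        else:
            change = 0.0
            for t, (x, y) in cur.items():
                # Displacement since the last kept frame.
                change += hypot(x - ref[t][0], y - ref[t][1])
                # Difference in velocity since the last kept frame.
                vx, vy = vel.get(t, (0.0, 0.0))
                rvx, rvy = ref_vel.get(t, (0.0, 0.0))
                change += velocity_weight * hypot(vx - rvx, vy - rvy)

        if change > threshold:
            kept.append(i)
            ref, ref_vel = cur, vel
        prev = cur

    return kept


if __name__ == "__main__":
    # A tool that drifts slowly, jumps, then drifts slowly again.
    demo = [
        {"grasper": (100, 100)},
        {"grasper": (101, 100)},
        {"grasper": (102, 100)},
        {"grasper": (130, 100)},
        {"grasper": (131, 100)},
    ]
    print(select_keyframes(demo))
```

In this sketch, frames in which the tools barely move are skipped, while a sudden displacement or a sharp change in speed causes the frame to be retained. Only the retained frames would then be passed to the classifier, which is where the reduction in dataset size would come from.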

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/2151/12369944/7ca7d0f5c13b/nihms-2090652-f0001.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/2151/12369944/708ad25b52f2/nihms-2090652-f0002.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/2151/12369944/0e9ac47bd17d/nihms-2090652-f0003.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/2151/12369944/5299896d326c/nihms-2090652-f0004.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/2151/12369944/80269155897a/nihms-2090652-f0005.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/2151/12369944/f237b26a1ca8/nihms-2090652-f0006.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/2151/12369944/1cd2b97c1bcb/nihms-2090652-f0007.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/2151/12369944/c8e95dc1a136/nihms-2090652-f0008.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/2151/12369944/4c1b5d056d08/nihms-2090652-f0009.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/2151/12369944/64e894828c23/nihms-2090652-f0010.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/2151/12369944/a6585efbf792/nihms-2090652-f0011.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/2151/12369944/ae0aa0aee51c/nihms-2090652-f0012.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/2151/12369944/8f16f7e77964/nihms-2090652-f0013.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/2151/12369944/a4c6cb5251f9/nihms-2090652-f0014.jpg
