Temporally guided articulated hand pose tracking in surgical videos.

Affiliations

EECS, University of Michigan, Ann Arbor, MI, USA.

Cloud and AI, Microsoft, Redmond, WA, USA.

Publication information

Int J Comput Assist Radiol Surg. 2023 Jan;18(1):117-125. doi: 10.1007/s11548-022-02761-6. Epub 2022 Oct 3.

Abstract

PURPOSE

Articulated hand pose tracking is an under-explored problem with potential applications across a wide range of domains, especially medicine. A robust and accurate tracking system for surgical videos would allow the motion dynamics and movement patterns of the hands to be captured and analyzed for many downstream tasks.

METHODS

In this work, we propose a novel hand pose estimation model, CondPose, which improves detection and tracking accuracy by incorporating a pose prior into its predictions. By following a temporally guided approach that effectively leverages past predictions, we show improvements over state-of-the-art methods, which make frame-wise independent predictions.
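The abstract does not specify the architecture, but the core idea of conditioning the current frame's prediction on the previous frame's pose can be sketched as follows. This is a minimal PyTorch-style illustration under our own assumptions: the pose prior is rendered as per-keypoint heatmaps from the previous frame's prediction, and the names (TemporallyGuidedPoseHead, feat_channels, NUM_KEYPOINTS) are hypothetical, not the authors' actual CondPose implementation.

```python
# Minimal sketch of a temporally guided pose head. Assumption: the pose
# prior is encoded as per-keypoint heatmaps rendered from the previous
# frame's predicted keypoints. All names and architectural choices here
# are illustrative, not the paper's actual CondPose implementation.
import torch
import torch.nn as nn

NUM_KEYPOINTS = 21  # common hand-keypoint count; an assumption


class TemporallyGuidedPoseHead(nn.Module):
    def __init__(self, feat_channels: int = 256):
        super().__init__()
        # Fuse current-frame image features with prior-frame heatmaps.
        self.fuse = nn.Conv2d(feat_channels + NUM_KEYPOINTS,
                              feat_channels, kernel_size=3, padding=1)
        self.relu = nn.ReLU(inplace=True)
        self.head = nn.Conv2d(feat_channels, NUM_KEYPOINTS, kernel_size=1)

    def forward(self, feats: torch.Tensor,
                prior_heatmaps: torch.Tensor) -> torch.Tensor:
        # feats: (B, C, H, W) backbone features for the current frame.
        # prior_heatmaps: (B, K, H, W) heatmaps from the previous frame's
        # prediction (zeros when no prior detection is available, which
        # falls back to frame-wise independent estimation).
        x = torch.cat([feats, prior_heatmaps], dim=1)
        x = self.relu(self.fuse(x))
        return self.head(x)  # (B, K, H, W) current-frame heatmaps
```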

RESULTS

We collect Surgical Hands, the first dataset that provides multi-instance articulated hand pose annotations for videos. The dataset contains over 8.1k annotated hand poses from publicly available surgical videos, with bounding boxes, pose annotations, and tracking IDs to enable multi-instance tracking. When evaluated on Surgical Hands, our method outperforms the state-of-the-art approach on mean Average Precision (mAP), which measures pose estimation accuracy, and Multiple Object Tracking Accuracy (MOTA), which assesses pose tracking performance.
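For reference, MOTA is conventionally defined as one minus the rate of misses, false positives, and identity switches over the total ground-truth count (Bernardin and Stiefelhagen, 2008). A minimal sketch with illustrative numbers, not the paper's results or its exact matching protocol:

```python
# Sketch of MOTA as commonly defined:
#     MOTA = 1 - (FN + FP + IDSW) / GT
# with counts summed over all frames. The inputs below are illustrative;
# the paper's exact matching and counting protocol may differ.
def mota(num_misses: int, num_false_positives: int,
         num_id_switches: int, num_gt_objects: int) -> float:
    """Multiple Object Tracking Accuracy over a whole sequence."""
    if num_gt_objects == 0:
        raise ValueError("MOTA is undefined with zero ground-truth objects")
    errors = num_misses + num_false_positives + num_id_switches
    return 1.0 - errors / num_gt_objects


# Example: 120 misses, 80 false positives, and 10 ID switches over 2000
# ground-truth hand instances gives MOTA = 1 - 210/2000 = 0.895.
print(mota(120, 80, 10, 2000))  # 0.895
```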

CONCLUSION

Compared with a frame-wise independent strategy, our approach shows greater performance in detecting and tracking hand poses and a more substantial impact on localization accuracy. This has positive implications for generating more accurate representations of hands in the scene for targeted downstream tasks.

Fig. 1: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/b8cd/9883342/d5b29cc19e60/11548_2022_2761_Fig1_HTML.jpg
