Temporally guided articulated hand pose tracking in surgical videos.

Affiliations

EECS, University of Michigan, Ann Arbor, MI, USA.

Cloud and AI, Microsoft, Redmond, WA, USA.

Publication information

Int J Comput Assist Radiol Surg. 2023 Jan;18(1):117-125. doi: 10.1007/s11548-022-02761-6. Epub 2022 Oct 3.

Abstract

PURPOSE

Articulated hand pose tracking is an under-explored problem with potential applications across a wide range of domains, especially medicine. A robust and accurate tracking system for surgical videos would allow the motion dynamics and movement patterns of the hands to be captured and analyzed for many downstream tasks.

METHODS

In this work, we propose a novel hand pose estimation model, CondPose, which improves detection and tracking accuracy by incorporating a pose prior into its predictions. By following a temporally guided approach that effectively leverages past predictions, we show improvements over state-of-the-art methods, which make frame-wise independent predictions.
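The abstract does not specify the architecture, but the core idea of conditioning the current frame's prediction on the previous frame's pose can be sketched as follows. This is a minimal PyTorch-style illustration under our own assumptions: the pose prior is rendered as per-keypoint heatmaps from the previous frame's prediction, and the names (TemporallyGuidedPoseHead, feat_channels, NUM_KEYPOINTS) are hypothetical, not the authors' actual CondPose implementation.

```python
# Minimal sketch of a temporally guided pose head. Assumption: the pose
# prior is encoded as per-keypoint heatmaps rendered from the previous
# frame's predicted keypoints. All names and architectural choices here
# are illustrative, not the paper's actual CondPose implementation.
import torch
import torch.nn as nn

NUM_KEYPOINTS = 21  # common hand-keypoint count; an assumption


class TemporallyGuidedPoseHead(nn.Module):
    def __init__(self, feat_channels: int = 256):
        super().__init__()
        # Fuse current-frame image features with prior-frame heatmaps.
        self.fuse = nn.Conv2d(feat_channels + NUM_KEYPOINTS,
                              feat_channels, kernel_size=3, padding=1)
        self.relu = nn.ReLU(inplace=True)
        self.head = nn.Conv2d(feat_channels, NUM_KEYPOINTS, kernel_size=1)

    def forward(self, feats: torch.Tensor,
                prior_heatmaps: torch.Tensor) -> torch.Tensor:
        # feats: (B, C, H, W) backbone features for the current frame.
        # prior_heatmaps: (B, K, H, W) heatmaps from the previous frame's
        # prediction (zeros when no prior detection is available, which
        # falls back to frame-wise independent estimation).
        x = torch.cat([feats, prior_heatmaps], dim=1)
        x = self.relu(self.fuse(x))
        return self.head(x)  # (B, K, H, W) current-frame heatmaps
```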

RESULTS

We collect Surgical Hands, the first dataset that provides multi-instance articulated hand pose annotations for videos. The dataset contains over 8.1k annotated hand poses from publicly available surgical videos, with bounding boxes, pose annotations, and tracking IDs to enable multi-instance tracking. When evaluated on Surgical Hands, our method outperforms the state-of-the-art approach on mean Average Precision (mAP), which measures pose estimation accuracy, and Multiple Object Tracking Accuracy (MOTA), which assesses pose tracking performance.
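For reference, MOTA is conventionally defined as one minus the rate of misses, false positives, and identity switches over the total ground-truth count (Bernardin and Stiefelhagen, 2008). A minimal sketch with illustrative numbers, not the paper's results or its exact matching protocol:

```python
# Sketch of MOTA as commonly defined:
#     MOTA = 1 - (FN + FP + IDSW) / GT
# with counts summed over all frames. The inputs below are illustrative;
# the paper's exact matching and counting protocol may differ.
def mota(num_misses: int, num_false_positives: int,
         num_id_switches: int, num_gt_objects: int) -> float:
    """Multiple Object Tracking Accuracy over a whole sequence."""
    if num_gt_objects == 0:
        raise ValueError("MOTA is undefined with zero ground-truth objects")
    errors = num_misses + num_false_positives + num_id_switches
    return 1.0 - errors / num_gt_objects


# Example: 120 misses, 80 false positives, and 10 ID switches over 2000
# ground-truth hand instances gives MOTA = 1 - 210/2000 = 0.895.
print(mota(120, 80, 10, 2000))  # 0.895
```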

CONCLUSION

Compared with a frame-wise independent strategy, our approach shows greater performance in detecting and tracking hand poses and a more substantial impact on localization accuracy. This has positive implications for generating more accurate representations of hands in the scene for targeted downstream tasks.

Fig. 1: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/b8cd/9883342/d5b29cc19e60/11548_2022_2761_Fig1_HTML.jpg
