Semi-supervised body parsing and pose estimation for enhancing infant general movement assessment.

Authors

Ni Haomiao, Xue Yuan, Ma Liya, Zhang Qian, Li Xiaoye, Huang Sharon X

Affiliations

College of Information Sciences and Technology, The Pennsylvania State University, University Park, PA, USA.

College of Information Sciences and Technology, The Pennsylvania State University, University Park, PA, USA; Department of Electrical and Computer Engineering, Johns Hopkins University, Baltimore, MD, USA.

Publication information

Med Image Anal. 2023 Jan;83:102654. doi: 10.1016/j.media.2022.102654. Epub 2022 Oct 14.

Abstract

General movement assessment (GMA) of infant movement videos (IMVs) is an effective method for early detection of cerebral palsy (CP) in infants. We demonstrate in this paper that end-to-end trainable neural networks for image sequence recognition can be applied to achieve good results in GMA, and more importantly, augmenting raw video with infant body parsing and pose estimation information can significantly improve performance. To solve the problem of efficiently utilizing partially labeled IMVs for body parsing, we propose a semi-supervised model, termed SiamParseNet (SPN), which consists of two branches, one for intra-frame body parts segmentation and another for inter-frame label propagation. During training, the two branches are jointly trained by alternating between using input pairs of only labeled frames and input of both labeled and unlabeled frames. We also investigate training data augmentation by proposing a factorized video generative adversarial network (FVGAN) to synthesize novel labeled frames for training. FVGAN decouples foreground and background generation which allows for generating multiple labeled frames from one real labeled frame. When testing, we employ a multi-source inference mechanism, where the final result for a test frame is either obtained via the segmentation branch or via propagation from a nearby key frame. We conduct extensive experiments for body parsing using SPN on two infant movement video datasets; on these partially labeled IMVs, we show that SPN coupled with FVGAN achieves state-of-the-art performance. We further demonstrate that our proposed SPN can be easily adapted to the infant pose estimation task with superior performance. Last but not least, we explore the clinical application of our method for GMA. 
We collected a new clinical IMV dataset with GMA annotations, and our experiments show that our SPN models for body parsing and pose estimation trained on the first two datasets generalize well to the new clinical dataset and their results can significantly boost the convolutional recurrent neural network (CRNN) based GMA prediction performance when combined with raw video inputs.
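
The multi-source inference step described in the abstract can be illustrated with a small sketch. The abstract states only that a test frame's result comes either from the segmentation branch or from propagation from a nearby key frame; it does not say how that choice is made. The rule below (propagate when a labeled key frame lies within `max_gap` frames, otherwise segment) is an assumption made for illustration, and the names `nearest_key_frame`, `multi_source_inference`, `segment`, `propagate`, and `max_gap` are hypothetical, not taken from the paper.

```python
from typing import Any, Callable, Optional, Sequence


def nearest_key_frame(index: int, key_frames: Sequence[int]) -> Optional[int]:
    """Return the key-frame index closest to `index`, or None if there are none."""
    if not key_frames:
        return None
    return min(key_frames, key=lambda k: abs(k - index))


def multi_source_inference(
    index: int,
    key_frames: Sequence[int],
    segment: Callable[[int], Any],
    propagate: Callable[[int, int], Any],
    max_gap: int = 5,
) -> Any:
    """Pick a prediction source for the frame at `index`.

    ASSUMPTION: the selection rule here is a simple distance threshold.
    The abstract does not specify the actual criterion used by SPN.
    `segment(i)` stands in for the intra-frame segmentation branch and
    `propagate(k, i)` for inter-frame label propagation from key frame k.
    """
    key = nearest_key_frame(index, key_frames)
    if key is not None and abs(key - index) <= max_gap:
        # A key frame is close enough: reuse its labels via propagation.
        return propagate(key, index)
    # No nearby key frame: fall back to per-frame segmentation.
    return segment(index)


if __name__ == "__main__":
    # Stand-in branches that only report which source was used.
    def segment(i: int) -> tuple:
        return ("segmentation", i)

    def propagate(k: int, i: int) -> tuple:
        return ("propagation", k, i)

    for frame in (12, 15):
        print(frame, multi_source_inference(frame, [0, 10, 20], segment, propagate, max_gap=3))
```

In a real system the two callables would wrap the trained branches, and the threshold would likely be replaced by whatever criterion the full paper defines.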
