

Tracklet Pair Proposal and Context Reasoning for Video Scene Graph Generation.

Affiliation

Department of Computer Science, Kyonggi University, Suwon-si 16227, Korea.

Publication

Sensors (Basel). 2021 May 2;21(9):3164. doi: 10.3390/s21093164.

DOI:10.3390/s21093164
PMID:34063299
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC8124611/
Abstract

Video scene graph generation (ViDSGG), the creation of video scene graphs that support deeper and better visual scene understanding, is a challenging task. Segment-based and sliding-window-based methods have been proposed to perform this task; however, they all have certain limitations. This study proposes a novel deep neural network model called VSGG-Net for video scene graph generation. The model uses a sliding-window scheme to detect object tracklets of various lengths throughout the entire video. In particular, the proposed model presents a new tracklet pair proposal method that evaluates the relatedness of object tracklet pairs using a pretrained neural network and statistical information. To effectively utilize the spatio-temporal context, low-level visual context reasoning is performed using a spatio-temporal context graph and a graph neural network, as well as high-level semantic context reasoning. To improve the detection performance for sparse relationships, the proposed model applies a class weighting technique that adjusts the weight of sparse relationships to a higher level. This study demonstrates the positive effect and high performance of the proposed model through experiments using the benchmark datasets VidOR and VidVRD.

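The class weighting technique mentioned in the abstract can be illustrated with a small sketch. The helper below is hypothetical (the paper does not publish this code) and assumes a tempered inverse-frequency scheme: relationship classes that occur rarely in the training data receive larger loss weights, so sparse relations are emphasized during training.

```python
from collections import Counter

def relation_class_weights(labels, beta=0.5):
    """Hypothetical inverse-frequency class weighting for relation labels.

    labels: list of relationship class names observed in training data.
    beta:   tempering exponent; 1.0 gives full inverse frequency,
            smaller values soften the imbalance correction.
    Returns a dict mapping each class to a weight with mean 1.0.
    """
    counts = Counter(labels)
    n = len(labels)
    # Inverse frequency raised to beta, so extremely rare classes
    # do not dominate the loss outright.
    raw = {c: (n / cnt) ** beta for c, cnt in counts.items()}
    # Normalize so the average weight across classes is 1.0.
    mean = sum(raw.values()) / len(raw)
    return {c: w / mean for c, w in raw.items()}

# Toy distribution: one frequent, one moderate, one sparse relation.
weights = relation_class_weights(["next_to"] * 90 + ["ride"] * 9 + ["bite"] * 1)
```

Such weights would typically be passed to a weighted cross-entropy loss over relationship classes; the sparse class ("bite" above) ends up with the largest weight.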

Figures

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/ba85/8124611/f2b93a777dbd/sensors-21-03164-g001.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/ba85/8124611/2d918f76181d/sensors-21-03164-g002.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/ba85/8124611/a2191be462e3/sensors-21-03164-g003.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/ba85/8124611/9835e502d75c/sensors-21-03164-g004.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/ba85/8124611/01b2c449cc61/sensors-21-03164-g005.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/ba85/8124611/737931cdde60/sensors-21-03164-g006.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/ba85/8124611/9f600920ac1a/sensors-21-03164-g007.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/ba85/8124611/6c9d2b2e9f73/sensors-21-03164-g008.jpg

Similar Articles

1. Tracklet Pair Proposal and Context Reasoning for Video Scene Graph Generation.
Sensors (Basel). 2021 May 2;21(9):3164. doi: 10.3390/s21093164.
2. Sparse Spatial-Temporal Emotion Graph Convolutional Network for Video Emotion Recognition.
Comput Intell Neurosci. 2022 Sep 28;2022:3518879. doi: 10.1155/2022/3518879. eCollection 2022.
3. Pair Then Relation: Pair-Net for Panoptic Scene Graph Generation.
IEEE Trans Pattern Anal Mach Intell. 2024 Dec;46(12):10452-10465. doi: 10.1109/TPAMI.2024.3442301. Epub 2024 Nov 6.
4. Spatial-Temporal Knowledge-Embedded Transformer for Video Scene Graph Generation.
IEEE Trans Image Process. 2023 Dec 28;PP. doi: 10.1109/TIP.2023.3345652.
5. Graph-Based Visual Manipulation Relationship Reasoning Network for Robotic Grasping.
Front Neurorobot. 2021 Aug 13;15:719731. doi: 10.3389/fnbot.2021.719731. eCollection 2021.
6. HAtt-Flow: Hierarchical Attention-Flow Mechanism for Group-Activity Scene Graph Generation in Videos.
Sensors (Basel). 2024 May 24;24(11):3372. doi: 10.3390/s24113372.
7. Path-based knowledge reasoning with textual semantic information for medical knowledge graph completion.
BMC Med Inform Decis Mak. 2021 Nov 29;21(Suppl 9):335. doi: 10.1186/s12911-021-01622-7.
8. Fine-Grained Video Retrieval With Scene Sketches.
IEEE Trans Image Process. 2023;32:3136-3149. doi: 10.1109/TIP.2023.3278474. Epub 2023 Jun 2.
9. Spatio-Temporal Action Detection in Untrimmed Videos by Using Multimodal Features and Region Proposals.
Sensors (Basel). 2019 Mar 3;19(5):1085. doi: 10.3390/s19051085.
10. Cross-Attentional Spatio-Temporal Semantic Graph Networks for Video Question Answering.
IEEE Trans Image Process. 2022;31:1684-1696. doi: 10.1109/TIP.2022.3142526. Epub 2022 Feb 3.

Cited By

1. Development and validation of predictive model based on deep learning method for classification of dyslipidemia in Chinese medicine.
Health Inf Sci Syst. 2023 Apr 6;11(1):21. doi: 10.1007/s13755-023-00215-0. eCollection 2023 Dec.

References

1. Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks.
IEEE Trans Pattern Anal Mach Intell. 2017 Jun;39(6):1137-1149. doi: 10.1109/TPAMI.2016.2577031. Epub 2016 Jun 6.