

Tracklet Pair Proposal and Context Reasoning for Video Scene Graph Generation.

Affiliation

Department of Computer Science, Kyonggi University, Suwon-si 16227, Korea.

Publication

Sensors (Basel). 2021 May 2;21(9):3164. doi: 10.3390/s21093164.

DOI:10.3390/s21093164
PMID:34063299
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC8124611/
Abstract

Video scene graph generation (ViDSGG), the creation of video scene graphs that support deeper and better visual scene understanding, is a challenging task. Segment-based and sliding-window-based methods have been proposed to perform this task; however, they all have certain limitations. This study proposes a novel deep neural network model called VSGG-Net for video scene graph generation. The model uses a sliding-window scheme to detect object tracklets of various lengths throughout the entire video. In particular, the proposed model presents a new tracklet pair proposal method that evaluates the relatedness of object tracklet pairs using a pretrained neural network and statistical information. To effectively utilize the spatio-temporal context, low-level visual context reasoning is performed using a spatio-temporal context graph and a graph neural network, as well as high-level semantic context reasoning. To improve the detection performance for sparse relationships, the proposed model applies a class weighting technique that adjusts the weight of sparse relationships to a higher level. This study demonstrates the positive effect and high performance of the proposed model through experiments using the benchmark datasets VidOR and VidVRD.

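The class weighting technique mentioned in the abstract can be illustrated with a small sketch. The helper below is hypothetical (the paper does not publish this code) and assumes a tempered inverse-frequency scheme: relationship classes that occur rarely in the training data receive larger loss weights, so sparse relations are emphasized during training.

```python
from collections import Counter

def relation_class_weights(labels, beta=0.5):
    """Hypothetical inverse-frequency class weighting for relation labels.

    labels: list of relationship class names observed in training data.
    beta:   tempering exponent; 1.0 gives full inverse frequency,
            smaller values soften the imbalance correction.
    Returns a dict mapping each class to a weight with mean 1.0.
    """
    counts = Counter(labels)
    n = len(labels)
    # Inverse frequency raised to beta, so extremely rare classes
    # do not dominate the loss outright.
    raw = {c: (n / cnt) ** beta for c, cnt in counts.items()}
    # Normalize so the average weight across classes is 1.0.
    mean = sum(raw.values()) / len(raw)
    return {c: w / mean for c, w in raw.items()}

# Toy distribution: one frequent, one moderate, one sparse relation.
weights = relation_class_weights(["next_to"] * 90 + ["ride"] * 9 + ["bite"] * 1)
```

Such weights would typically be passed to a weighted cross-entropy loss over relationship classes; the sparse class ("bite" above) ends up with the largest weight.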

Figures

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/ba85/8124611/f2b93a777dbd/sensors-21-03164-g001.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/ba85/8124611/2d918f76181d/sensors-21-03164-g002.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/ba85/8124611/a2191be462e3/sensors-21-03164-g003.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/ba85/8124611/9835e502d75c/sensors-21-03164-g004.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/ba85/8124611/01b2c449cc61/sensors-21-03164-g005.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/ba85/8124611/737931cdde60/sensors-21-03164-g006.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/ba85/8124611/9f600920ac1a/sensors-21-03164-g007.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/ba85/8124611/6c9d2b2e9f73/sensors-21-03164-g008.jpg

Similar Articles

1. Tracklet Pair Proposal and Context Reasoning for Video Scene Graph Generation.
Sensors (Basel). 2021 May 2;21(9):3164. doi: 10.3390/s21093164.
2. Sparse Spatial-Temporal Emotion Graph Convolutional Network for Video Emotion Recognition.
Comput Intell Neurosci. 2022 Sep 28;2022:3518879. doi: 10.1155/2022/3518879. eCollection 2022.
3. Pair Then Relation: Pair-Net for Panoptic Scene Graph Generation.
IEEE Trans Pattern Anal Mach Intell. 2024 Dec;46(12):10452-10465. doi: 10.1109/TPAMI.2024.3442301. Epub 2024 Nov 6.
4. Spatial-Temporal Knowledge-Embedded Transformer for Video Scene Graph Generation.
IEEE Trans Image Process. 2023 Dec 28;PP. doi: 10.1109/TIP.2023.3345652.
5. Graph-Based Visual Manipulation Relationship Reasoning Network for Robotic Grasping.
Front Neurorobot. 2021 Aug 13;15:719731. doi: 10.3389/fnbot.2021.719731. eCollection 2021.
6. HAtt-Flow: Hierarchical Attention-Flow Mechanism for Group-Activity Scene Graph Generation in Videos.
Sensors (Basel). 2024 May 24;24(11):3372. doi: 10.3390/s24113372.
7. Path-based knowledge reasoning with textual semantic information for medical knowledge graph completion.
BMC Med Inform Decis Mak. 2021 Nov 29;21(Suppl 9):335. doi: 10.1186/s12911-021-01622-7.
8. Fine-Grained Video Retrieval With Scene Sketches.
IEEE Trans Image Process. 2023;32:3136-3149. doi: 10.1109/TIP.2023.3278474. Epub 2023 Jun 2.
9. Spatio-Temporal Action Detection in Untrimmed Videos by Using Multimodal Features and Region Proposals.
Sensors (Basel). 2019 Mar 3;19(5):1085. doi: 10.3390/s19051085.
10. Cross-Attentional Spatio-Temporal Semantic Graph Networks for Video Question Answering.
IEEE Trans Image Process. 2022;31:1684-1696. doi: 10.1109/TIP.2022.3142526. Epub 2022 Feb 3.

Cited By

1. Development and validation of predictive model based on deep learning method for classification of dyslipidemia in Chinese medicine.
Health Inf Sci Syst. 2023 Apr 6;11(1):21. doi: 10.1007/s13755-023-00215-0. eCollection 2023 Dec.

References

1. Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks.
IEEE Trans Pattern Anal Mach Intell. 2017 Jun;39(6):1137-1149. doi: 10.1109/TPAMI.2016.2577031. Epub 2016 Jun 6.