Grounding human-object interaction to affordance behavior in multimodal datasets.

Authors

Henlein Alexander, Gopinath Anju, Krishnaswamy Nikhil, Mehler Alexander, Pustejovsky James

Affiliations

Text Technology Lab, Faculty of Computer Science and Mathematics, Institute of Computer Science, Goethe University Frankfurt, Frankfurt, Germany.

Situated Grounding and Natural Language Lab, Department of Computer Science, Colorado State University, Fort Collins, CO, United States.

Publication information

Front Artif Intell. 2023 Jan 30;6:1084740. doi: 10.3389/frai.2023.1084740. eCollection 2023.

Abstract

While affordance detection and human-object interaction (HOI) detection are related tasks, the theoretical foundation of affordances makes it clear that the two are distinct. In particular, affordance researchers distinguish between J. J. Gibson's traditional definition of an affordance, "the action possibilities" of the object within the environment, and the definition of a telic affordance, one defined by conventionalized purpose or use. We augment the HICO-DET dataset with annotations for Gibsonian and telic affordances, and a subset of the dataset with annotations for the orientation of the humans and objects involved. We then train an adapted HOI model and evaluate a pre-trained viewpoint estimation system on this augmented dataset. Our model, AffordanceUPT, is based on a two-stage adaptation of the Unary-Pairwise Transformer (UPT), which we modularize to make affordance detection independent of object detection. Our approach exhibits generalization to new objects and actions, can effectively make the Gibsonian/telic distinction, and shows that this distinction is correlated with features in the data that are not captured by the HOI annotations of the HICO-DET dataset.
