Suppr 超能文献



Capturing Conversational Gestures for Embodied Conversational Agents Using an Optimized Kanade-Lucas-Tomasi Tracker and Denavit-Hartenberg-Based Kinematic Model.

Affiliations

Faculty of Electrical Engineering and Computer Science, University of Maribor, Koroška c. 46, 2000 Maribor, Slovenia.

Publication

Sensors (Basel). 2022 Oct 29;22(21):8318. doi: 10.3390/s22218318.

DOI: 10.3390/s22218318
PMID: 36366016
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC9656321/
Abstract

In order to recreate viable and human-like conversational responses, the artificial entity, i.e., an embodied conversational agent, must express correlated speech (verbal) and gestures (non-verbal) responses in spoken social interaction. Most of the existing frameworks focus on intent planning and behavior planning. The realization, however, is left to a limited set of static 3D representations of conversational expressions. In addition to functional and semantic synchrony between verbal and non-verbal signals, the final believability of the displayed expression is sculpted by the physical realization of non-verbal expressions. A major challenge of most conversational systems capable of reproducing gestures is the diversity in expressiveness. In this paper, we propose a method for capturing gestures automatically from videos and transforming them into 3D representations stored as part of the conversational agent's repository of motor skills. The main advantage of the proposed method is ensuring the naturalness of the embodied conversational agent's gestures, which results in a higher quality of human-computer interaction. The method is based on a Kanade-Lucas-Tomasi tracker, a Savitzky-Golay filter, a Denavit-Hartenberg-based kinematic model and the EVA framework. Furthermore, we designed an objective method based on cosine similarity instead of a subjective evaluation of synthesized movement. The proposed method resulted in a 96% similarity.
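The capture step rests on Kanade-Lucas-Tomasi tracking, which estimates the motion of a small image window by solving a least-squares system over spatial and temporal image gradients. The following is a minimal single-window NumPy sketch on synthetic data — an illustration of the underlying Lucas-Kanade equations, not the paper's optimized tracker:

```python
import numpy as np

def lucas_kanade_window(prev, curr, cx, cy, half=10):
    """Estimate (u, v) optical flow in a window centred at (cx, cy)
    by solving the Lucas-Kanade least-squares system A [u v]^T = b."""
    Iy, Ix = np.gradient(prev)              # spatial gradients
    It = curr - prev                        # temporal derivative
    win = (slice(cy - half, cy + half + 1), slice(cx - half, cx + half + 1))
    ix, iy, it = Ix[win].ravel(), Iy[win].ravel(), It[win].ravel()
    A = np.stack([ix, iy], axis=1)          # N x 2 gradient matrix
    b = -it                                 # brightness-constancy residual
    (u, v), *_ = np.linalg.lstsq(A, b, rcond=None)
    return u, v

# Synthetic check: a Gaussian blob translated 1 px to the right.
y, x = np.mgrid[0:64, 0:64]
blob = np.exp(-((x - 32) ** 2 + (y - 32) ** 2) / (2 * 5.0 ** 2))
shifted = np.roll(blob, 1, axis=1)
u, v = lucas_kanade_window(blob, shifted, 32, 32)  # u ~ 1, v ~ 0
```

In practice a pyramidal, multi-window implementation (e.g. OpenCV's `calcOpticalFlowPyrLK`) would be used; the sketch shows only the per-window estimate the method is built from.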

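The tracked points are mapped onto an articulated figure through a Denavit-Hartenberg-based kinematic model, in which each joint contributes one homogeneous transform parameterized by (theta, d, a, alpha). A minimal forward-kinematics sketch — the joint values and link lengths below are illustrative, not taken from the paper:

```python
import numpy as np

def dh_matrix(theta, d, a, alpha):
    """Standard Denavit-Hartenberg homogeneous transform for one joint."""
    ct, st = np.cos(theta), np.sin(theta)
    ca, sa = np.cos(alpha), np.sin(alpha)
    return np.array([
        [ct, -st * ca,  st * sa, a * ct],
        [st,  ct * ca, -ct * sa, a * st],
        [0.0,      sa,       ca,      d],
        [0.0,     0.0,      0.0,    1.0],
    ])

def forward_kinematics(dh_params):
    """Chain per-joint transforms; returns the end-effector pose (4x4)."""
    T = np.eye(4)
    for theta, d, a, alpha in dh_params:
        T = T @ dh_matrix(theta, d, a, alpha)
    return T

# Hypothetical 2-link planar arm, both joints at 45 degrees.
params = [(np.pi / 4, 0.0, 0.30, 0.0),
          (np.pi / 4, 0.0, 0.25, 0.0)]
pos = forward_kinematics(params)[:3, 3]   # end-effector position
```

Given a target trajectory for the hand, inverse kinematics over the same parameterization recovers the joint angles the agent must replay.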

Figures (PMC9656321):
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/5288/9656321/f378fd3b68d5/sensors-22-08318-g001.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/5288/9656321/94b68180e963/sensors-22-08318-g002.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/5288/9656321/eb5561cb267b/sensors-22-08318-g003.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/5288/9656321/1debd6e244bf/sensors-22-08318-g004.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/5288/9656321/7ed2891982f1/sensors-22-08318-g005.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/5288/9656321/6fc692daaa67/sensors-22-08318-g006.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/5288/9656321/60ba75569c6f/sensors-22-08318-g007.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/5288/9656321/0ec3aaece495/sensors-22-08318-g008.jpg
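The abstract describes smoothing the captured trajectories with a Savitzky-Golay filter and scoring the synthesized motion objectively by cosine similarity (reporting 96%). A minimal NumPy sketch of both steps — the trajectory, filter settings, and noise level below are illustrative, not the paper's data:

```python
import numpy as np

def savgol_smooth(signal, window=11, order=3):
    """Minimal Savitzky-Golay smoother: least-squares polynomial fit in a
    sliding window, evaluated at the window centre (SciPy's
    savgol_filter is the production equivalent)."""
    half = window // 2
    padded = np.pad(signal, half, mode="edge")
    t = np.arange(-half, half + 1)
    out = np.empty(len(signal), dtype=float)
    for i in range(len(signal)):
        coeffs = np.polyfit(t, padded[i:i + window], order)
        out[i] = np.polyval(coeffs, 0)      # fitted value at window centre
    return out

def cosine_similarity(a, b):
    """Cosine similarity between two flattened motion trajectories."""
    a, b = np.ravel(a), np.ravel(b)
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

# Hypothetical noisy joint-angle trajectory vs. its clean reference.
rng = np.random.default_rng(0)
t = np.linspace(0, 2 * np.pi, 200)
clean = np.sin(t)
noisy = clean + rng.normal(0, 0.1, t.size)
smoothed = savgol_smooth(noisy)
sim = cosine_similarity(smoothed, clean)    # close to 1 after smoothing
```

Scoring similarity this way gives a deterministic, repeatable number, which is the stated motivation for preferring it over subjective evaluation of the synthesized movement.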

Similar articles

1
Capturing Conversational Gestures for Embodied Conversational Agents Using an Optimized Kanade-Lucas-Tomasi Tracker and Denavit-Hartenberg-Based Kinematic Model.
Sensors (Basel). 2022 Oct 29;22(21):8318. doi: 10.3390/s22218318.
2
Learning to generate pointing gestures in situated embodied conversational agents.
Front Robot AI. 2023 Mar 30;10:1110534. doi: 10.3389/frobt.2023.1110534. eCollection 2023.
3
Automating the Production of Communicative Gestures in Embodied Characters.
Front Psychol. 2018 Jul 9;9:1144. doi: 10.3389/fpsyg.2018.01144. eCollection 2018.
4
Atypicalities of Gesture Form and Function in Autistic Adults.
J Autism Dev Disord. 2019 Apr;49(4):1438-1454. doi: 10.1007/s10803-018-3829-x.
5
Modulating the assessment of semantic speech-gesture relatedness via transcranial direct current stimulation of the left frontal cortex.
Brain Stimul. 2017 Mar-Apr;10(2):223-230. doi: 10.1016/j.brs.2016.10.012. Epub 2016 Oct 25.
6
A Low-Cost Wearable Hand Gesture Detecting System Based on IMU and Convolutional Neural Network.
Annu Int Conf IEEE Eng Med Biol Soc. 2021 Nov;2021:6999-7002. doi: 10.1109/EMBC46164.2021.9630686.
7
Evaluation of text-to-gesture generation model using convolutional neural network.
Neural Netw. 2022 Jul;151:365-375. doi: 10.1016/j.neunet.2022.03.041. Epub 2022 Apr 4.
8
Gesture-speech synchrony in schizophrenia: A pilot study using a kinematic-acoustic analysis.
Neuropsychologia. 2022 Sep 9;174:108347. doi: 10.1016/j.neuropsychologia.2022.108347. Epub 2022 Aug 13.
9
Tapping toddlers' evolving semantic representation via gesture.
J Speech Lang Hear Res. 2007 Jun;50(3):732-45. doi: 10.1044/1092-4388(2007/051).
10
Transcranial Direct Current Stimulation Improves Semantic Speech-Gesture Matching in Patients With Schizophrenia Spectrum Disorder.
Schizophr Bull. 2019 Apr 25;45(3):522-530. doi: 10.1093/schbul/sby144.

Cited by

1
Computer Vision in Human Analysis: From Face and Body to Clothes.
Sensors (Basel). 2023 Jun 6;23(12):5378. doi: 10.3390/s23125378.
2
LiDAR-Based Maintenance of a Safe Distance between a Human and a Robot Arm.
Sensors (Basel). 2023 Apr 26;23(9):4305. doi: 10.3390/s23094305.

References

1
Trust and acceptance of a virtual psychiatric interview between embodied conversational agents and outpatients.
NPJ Digit Med. 2020 Jan 7;3:2. doi: 10.1038/s41746-019-0213-y. eCollection 2020.
2
Motion capture-based animated characters for the study of speech-gesture integration.
Behav Res Methods. 2020 Jun;52(3):1339-1354. doi: 10.3758/s13428-019-01319-w.
3
Development of an 8DOF quadruped robot and implementation of Inverse Kinematics using Denavit-Hartenberg convention.
Heliyon. 2018 Dec 17;4(12):e01053. doi: 10.1016/j.heliyon.2018.e01053. eCollection 2018 Dec.
4
Communicative intent modulates production and comprehension of actions and gestures: A Kinect study.
Cognition. 2018 Nov;180:38-51. doi: 10.1016/j.cognition.2018.04.003. Epub 2018 Jul 5.
5
Motion artifact detection and correction in functional near-infrared spectroscopy: a new hybrid method based on spline interpolation method and Savitzky-Golay filtering.
Neurophotonics. 2018 Jan;5(1):015003. doi: 10.1117/1.NPh.5.1.015003. Epub 2018 Feb 8.
6
Virtual Character Animation Based on Affordable Motion Capture and Reconfigurable Tangible Interfaces.
IEEE Trans Vis Comput Graph. 2018 May;24(5):1742-1755. doi: 10.1109/TVCG.2017.2690433. Epub 2017 Apr 3.
7
Two sides of the same coin: speech and gesture mutually interact to enhance comprehension.
Psychol Sci. 2010 Feb;21(2):260-7. doi: 10.1177/0956797609357327. Epub 2009 Dec 22.