Savery Richard, Zahray Lisa, Weinberg Gil
Robotic Musicianship Lab, Georgia Tech Center for Music Technology, Atlanta, GA, United States.
Front Robot AI. 2021 Apr 29;8:662355. doi: 10.3389/frobt.2021.662355. eCollection 2021.
Research in creative robotics continues to expand across all creative domains, including art, music, and language. Creative robots are primarily designed to be task-specific, with limited research into the implications of their design outside their core task. In the case of a musical robot, this includes the moments when a human sees and interacts with the robot before and after the performance, as well as between pieces. These non-musical interaction tasks, such as the presence of a robot during musical equipment setup, play a key role in the human perception of the robot; however, they have received only limited attention. In this paper, we describe a new audio system using emotional musical prosody, designed to match the creative process of a musical robot for use before, between, and after musical performances. Our generation system relies on the creation of a custom dataset for musical prosody. The system is designed foremost to operate in real time and to allow rapid generation and dialogue exchange between human and robot. For this reason, the system combines symbolic deep learning, through a Conditional Convolutional Variational Autoencoder, with an emotion-tagged audio sampler. We then compare this to a state-of-the-art text-to-speech system on our robotic platform, Shimon the marimba player. We conducted a between-groups study with 100 participants watching a musician interact with Shimon for 30 s. We were able to increase user ratings for the key creativity metrics, novelty and coherence, while maintaining ratings for expressivity across each implementation. Our results also indicated that by communicating in a form that relates to the robot's core functionality, we can raise likeability and perceived intelligence, while not altering animacy or anthropomorphism. These findings indicate the variation that can occur in the perception of a robot based on interactions surrounding a performance, such as initial meetings and the spaces between pieces, in addition to the core creative algorithms.