Hoang Nguyen Le, Taniguchi Tadahiro, Hagiwara Yoshinobu, Taniguchi Akira
Graduate School of Information Science and Engineering, Ritsumeikan University, Kusatsu, Shiga, Japan.
College of Information Science and Engineering, Ritsumeikan University, Kusatsu, Shiga, Japan.
Front Robot AI. 2024 Jan 31;10:1290604. doi: 10.3389/frobt.2023.1290604. eCollection 2023.
Deep generative models (DGMs) are increasingly employed in emergent communication systems, but their application to multimodal data remains limited. This study proposes a novel model that combines a multimodal DGM with the Metropolis-Hastings (MH) naming game, enabling two agents to attend jointly to a shared subject and develop a common vocabulary. The model demonstrates that it can handle multimodal data even when some modalities are missing. Integrating the MH naming game with multimodal variational autoencoders (VAEs) allows agents to form perceptual categories and exchange signs in multimodal contexts. Moreover, tuning the weight ratio to favor the modality that the model learns and categorizes more readily improved communication. Our evaluation of three multimodal fusion approaches, mixture-of-experts (MoE), product-of-experts (PoE), and mixture-of-products-of-experts (MoPoE), suggests that the choice of approach shapes the latent spaces that serve as the agents' internal representations. Results from experiments on the MNIST + SVHN and Multimodal165 datasets indicate that combining a Gaussian mixture model (GMM), a PoE multimodal VAE, and the MH naming game substantially improved information sharing, knowledge formation, and data reconstruction.
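In the MH naming game, the listener treats the sign uttered by the speaker as a Metropolis-Hastings proposal and accepts it according to how well that sign explains the listener's own perception. The sketch below illustrates that acceptance step under the standard MH acceptance ratio; the function and variable names are illustrative assumptions, not the paper's implementation.

```python
import numpy as np

rng = np.random.default_rng(0)

def mh_accept(proposed_sign, current_sign, listener_likelihood):
    # listener_likelihood[w]: the listener's own likelihood of its perception
    # under sign w, e.g. computed from its categorizer (a GMM over VAE latents).
    # The speaker's sign is accepted with probability
    # min(1, p(perception | proposed_sign) / p(perception | current_sign)),
    # so signs that better explain the listener's perception are always accepted,
    # and worse ones are accepted only occasionally.
    ratio = listener_likelihood[proposed_sign] / listener_likelihood[current_sign]
    return rng.random() < min(1.0, ratio)

# Example: a proposal that fits the listener's perception better (0.6 vs. 0.1)
# is accepted with probability 1.
accepted = mh_accept(2, 0, np.array([0.1, 0.3, 0.6]))
```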
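For the PoE fusion, the product of Gaussian experts has a closed form: the joint posterior's precision is the sum of the experts' precisions, and its mean is the precision-weighted average of the expert means. This is also what makes missing modalities easy to handle, since an absent modality's expert is simply omitted from the product. A minimal NumPy sketch under those standard assumptions (including a standard-normal prior expert; names are ours, not the paper's code):

```python
import numpy as np

def poe_fuse(mus, logvars, prior_precision=1.0):
    # Product-of-experts fusion of per-modality Gaussian posteriors N(mu_m, var_m).
    # The product (times a standard-normal prior expert) is Gaussian with
    # precision equal to the sum of expert precisions and a precision-weighted mean.
    # A missing modality is handled by omitting its entries from mus/logvars.
    precisions = [np.exp(-lv) for lv in logvars]   # precision_m = 1 / var_m
    prec_sum = prior_precision + sum(precisions)   # prior expert: N(0, 1)
    mu = sum(p * m for p, m in zip(precisions, mus)) / prec_sum
    return mu, np.log(1.0 / prec_sum)              # fused mean and log-variance
```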