Hoang Nguyen Le, Taniguchi Tadahiro, Hagiwara Yoshinobu, Taniguchi Akira
Graduate School of Information Science and Engineering, Ritsumeikan University, Kusatsu, Shiga, Japan.
College of Information Science and Engineering, Ritsumeikan University, Kusatsu, Shiga, Japan.
Front Robot AI. 2024 Jan 31;10:1290604. doi: 10.3389/frobt.2023.1290604. eCollection 2023.
Deep generative models (DGMs) are increasingly employed in emergent communication systems, but their application to multimodal data remains limited. This study proposes a novel model that combines a multimodal DGM with the Metropolis-Hastings (MH) naming game, enabling two agents to attend jointly to a shared subject and develop a common vocabulary. The model demonstrates that it can handle multimodal data even when some modalities are missing. Integrating the MH naming game with multimodal variational autoencoders (VAEs) allows agents to form perceptual categories and exchange signs in multimodal contexts. Moreover, tuning the weight ratio to favor the modality that the model learns and categorizes more readily improved communication. Our evaluation of three multimodal fusion approaches, mixture-of-experts (MoE), product-of-experts (PoE), and mixture-of-products-of-experts (MoPoE), suggests that the choice of approach shapes the latent spaces that serve as the agents' internal representations. Results from experiments on the MNIST + SVHN and Multimodal165 datasets indicate that combining a Gaussian mixture model (GMM), a PoE multimodal VAE, and the MH naming game substantially improved information sharing, knowledge formation, and data reconstruction.
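In the MH naming game, the listener treats the sign uttered by the speaker as a Metropolis-Hastings proposal and accepts it according to how well that sign explains the listener's own perception. The sketch below illustrates that acceptance step under the standard MH acceptance ratio; the function and variable names are illustrative assumptions, not the paper's implementation.

```python
import numpy as np

rng = np.random.default_rng(0)

def mh_accept(proposed_sign, current_sign, listener_likelihood):
    # listener_likelihood[w]: the listener's own likelihood of its perception
    # under sign w, e.g. computed from its categorizer (a GMM over VAE latents).
    # The speaker's sign is accepted with probability
    # min(1, p(perception | proposed_sign) / p(perception | current_sign)),
    # so signs that better explain the listener's perception are always accepted,
    # and worse ones are accepted only occasionally.
    ratio = listener_likelihood[proposed_sign] / listener_likelihood[current_sign]
    return rng.random() < min(1.0, ratio)

# Example: a proposal that fits the listener's perception better (0.6 vs. 0.1)
# is accepted with probability 1.
accepted = mh_accept(2, 0, np.array([0.1, 0.3, 0.6]))
```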
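For the PoE fusion, the product of Gaussian experts has a closed form: the joint posterior's precision is the sum of the experts' precisions, and its mean is the precision-weighted average of the expert means. This is also what makes missing modalities easy to handle, since an absent modality's expert is simply omitted from the product. A minimal NumPy sketch under those standard assumptions (including a standard-normal prior expert; names are ours, not the paper's code):

```python
import numpy as np

def poe_fuse(mus, logvars, prior_precision=1.0):
    # Product-of-experts fusion of per-modality Gaussian posteriors N(mu_m, var_m).
    # The product (times a standard-normal prior expert) is Gaussian with
    # precision equal to the sum of expert precisions and a precision-weighted mean.
    # A missing modality is handled by omitting its entries from mus/logvars.
    precisions = [np.exp(-lv) for lv in logvars]   # precision_m = 1 / var_m
    prec_sum = prior_precision + sum(precisions)   # prior expert: N(0, 1)
    mu = sum(p * m for p, m in zip(precisions, mus)) / prec_sum
    return mu, np.log(1.0 / prec_sum)              # fused mean and log-variance
```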