
Synthesizing Images From Spatio-Temporal Representations Using Spike-Based Backpropagation.

Authors

Roy Deboleena, Panda Priyadarshini, Roy Kaushik

Affiliations

Department of Electrical and Computer Engineering, Purdue University, West Lafayette, IN, United States.

Publication

Front Neurosci. 2019 Jun 18;13:621. doi: 10.3389/fnins.2019.00621. eCollection 2019.

Abstract

Spiking neural networks (SNNs) offer a promising alternative to current artificial neural networks for enabling low-power, event-driven neuromorphic hardware. Spike-based neuromorphic applications require processing and extracting meaningful information from spatio-temporal data, represented as series of spike trains over time. In this paper, we propose a method to synthesize images from multiple modalities in a spike-based environment. We use spiking autoencoders to convert image and audio inputs into compact spatio-temporal representations that are then decoded for image synthesis. For this, we use a direct training algorithm that computes the loss on the membrane potential of the output layer and back-propagates it by using a sigmoid approximation of the neuron's activation function to enable differentiability. The spiking autoencoders are benchmarked on MNIST and Fashion-MNIST and achieve very low reconstruction loss, comparable to ANNs. The spiking autoencoders are then trained to learn meaningful spatio-temporal representations of the data across two modalities: audio and visual. We synthesize images from audio in a spike-based environment by first generating, and then utilizing, such shared multi-modal spatio-temporal representations. Our audio-to-image synthesis model is tested on the task of converting TI-46 digit audio samples to MNIST images. We are able to synthesize images with high fidelity, and the model achieves competitive performance against ANNs.
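The core training idea described above (accumulating membrane potential over discrete time steps and back-propagating through the non-differentiable spike function via a sigmoid approximation) can be sketched as follows. This is a minimal illustrative sketch, not the authors' implementation: the leaky integrate-and-fire dynamics, the function names, and all parameter values (`threshold`, `leak`, `alpha`) are assumptions for demonstration.

```python
import numpy as np

def lif_forward(x_spikes, w, threshold=1.0, leak=0.99):
    """Run one leaky integrate-and-fire (LIF) layer over T input frames.

    x_spikes: (T, n_in) binary spike trains
    w:        (n_in, n_out) synaptic weights
    Returns the (T, n_out) output spike trains and the final membrane
    potential, on which a loss could be computed as in the abstract.
    """
    T, _ = x_spikes.shape
    n_out = w.shape[1]
    v = np.zeros(n_out)                  # membrane potential
    out = np.zeros((T, n_out))
    for t in range(T):
        v = leak * v + x_spikes[t] @ w   # leaky integration of input current
        out[t] = (v >= threshold).astype(float)  # fire where threshold crossed
        v = np.where(out[t] > 0, 0.0, v)         # reset fired neurons
    return out, v

def sigmoid_surrogate_grad(v, threshold=1.0, alpha=5.0):
    """Approximate d(spike)/d(v) with the derivative of a sigmoid
    centred at the firing threshold, enabling back-propagation
    through the otherwise non-differentiable step function."""
    s = 1.0 / (1.0 + np.exp(-alpha * (v - threshold)))
    return alpha * s * (1.0 - s)
```

During the backward pass, `sigmoid_surrogate_grad` would replace the zero-almost-everywhere derivative of the hard threshold, so that gradients of a membrane-potential loss can flow from the output layer back through the spiking layers.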


https://cdn.ncbi.nlm.nih.gov/pmc/blobs/872c/6611397/a2538806b5a6/fnins-13-00621-g0001.jpg
