Stefan J. Kiebel, Katharina von Kriegstein, Jean Daunizeau, Karl J. Friston
Wellcome Trust Centre for Neuroimaging, London, UK.
PLoS Comput Biol. 2009 Aug;5(8):e1000464. doi: 10.1371/journal.pcbi.1000464. Epub 2009 Aug 14.
The brain's decoding of fast sensory streams is currently impossible to emulate, even approximately, with artificial agents. For example, robust speech recognition is relatively easy for humans but exceptionally difficult for artificial speech recognition systems. In this paper, we propose that recognition can be simplified with an internal model of how sensory input is generated, when formulated in a Bayesian framework. We show that a plausible candidate for an internal or generative model is a hierarchy of 'stable heteroclinic channels'. This model describes continuous dynamics in the environment as a hierarchy of sequences, where slower sequences cause faster sequences. Under this model, online recognition corresponds to the dynamic decoding of causal sequences, giving a representation of the environment with predictive power on several timescales. We illustrate the ensuing decoding or recognition scheme using synthetic sequences of syllables, where syllables are sequences of phonemes and phonemes are sequences of sound-wave modulations. By presenting anomalous stimuli, we find that the resulting recognition dynamics disclose inference at multiple time scales and are reminiscent of neuronal dynamics seen in the real brain.
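To make the idea of a hierarchy of sequences more concrete, the following is a minimal sketch of the generative side only, under several assumptions that are not taken from the paper. It uses generalized Lotka-Volterra dynamics, a common way to produce winnerless competition, as a stand-in for a stable heteroclinic channel. A slow level cycles through three states, and its normalized state mixes three template connectivity matrices that set the order in which a fast level visits its four states. All names (`cyclic_inhibition`, `lv_step`, `run_shc`, `run_hierarchy`), the parameter values, and the small floor that replaces noise are hypothetical choices for illustration. The Bayesian recognition (model inversion) step described in the abstract is not implemented here.

```python
import numpy as np

def cyclic_inhibition(order, weak=0.5, strong=2.0):
    """Inhibition matrix rho[i, j] (effect of state j on state i) for a cyclic order."""
    order = tuple(order)
    n = len(order)
    rho = np.full((n, n), strong)
    np.fill_diagonal(rho, 1.0)
    for src, dst in zip(order, order[1:] + order[:1]):
        rho[dst, src] = weak  # the successor is only weakly inhibited, so it takes over next
    return rho

def lv_step(a, rho, rate, dt=0.05, floor=1e-3):
    """One Euler step of da/dt = rate * a * (1 - rho @ a).

    The floor is an assumed stand-in for noise: it keeps inactive states from
    decaying towards zero, so dwell times near each saddle stay bounded.
    """
    a = a + dt * rate * a * (1.0 - rho @ a)
    return np.maximum(a, floor)

def initial_state(n):
    """Start close to state 0, with every other state weakly active."""
    a = np.full(n, 0.01)
    a[0], a[1] = 0.9, 0.05
    return a

def collapse(trace):
    """Reduce a per-step list of winners to the sequence of distinct winners."""
    seq = [trace[0]]
    for w in trace[1:]:
        if w != seq[-1]:
            seq.append(w)
    return seq

def run_shc(rho, steps, rate=1.0):
    """Simulate a single level and return the order in which states dominate."""
    a = initial_state(len(rho))
    trace = []
    for _ in range(steps):
        a = lv_step(a, rho, rate)
        trace.append(int(np.argmax(a)))
    return collapse(trace)

def run_hierarchy(slow_rho, templates, steps, slow_rate=0.1, fast_rate=1.0):
    """Two levels: the slow state mixes the templates that drive the fast level."""
    slow = initial_state(len(slow_rho))
    fast = initial_state(templates.shape[1])
    slow_trace, fast_trace = [], []
    for _ in range(steps):
        slow = lv_step(slow, slow_rho, slow_rate)
        weights = slow / slow.sum()
        # Weighted sum of the template matrices (one template per slow state).
        fast_rho = np.tensordot(weights, templates, axes=1)
        fast = lv_step(fast, fast_rho, fast_rate)
        slow_trace.append(int(np.argmax(slow)))
        fast_trace.append(int(np.argmax(fast)))
    return collapse(slow_trace), collapse(fast_trace)

if __name__ == "__main__":
    slow_rho = cyclic_inhibition((0, 1, 2))
    templates = np.stack([
        cyclic_inhibition((0, 1, 2, 3)),  # fast order while slow state 0 dominates
        cyclic_inhibition((3, 2, 1, 0)),  # fast order while slow state 1 dominates
        cyclic_inhibition((0, 2, 1, 3)),  # fast order while slow state 2 dominates
    ])
    slow_seq, fast_seq = run_hierarchy(slow_rho, templates, steps=12000)
    print("slow sequence:", slow_seq)
    print("fast sequence:", fast_seq)
```

In this sketch the slow level plays the role of a syllable-like sequence and the fast level that of a phoneme-like sequence. The difference in timescale comes only from the two rate constants, and while the slow level is switching between states the fast connectivity is a blend of two templates, so the fast order is only well defined once the slow state has settled.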