Skerritt-Davis Benjamin, Elhilali Mounya
Johns Hopkins University, 3400 N Charles St, Baltimore, MD, USA.
J Neurosci Methods. 2021 Aug 1;360:109177. doi: 10.1016/j.jneumeth.2021.109177. Epub 2021 Apr 9.
The brain tracks sound sources as they evolve in time, collecting contextual information to predict future sensory inputs. Previous work in predictive coding typically focuses on the perception of predictable stimuli, leaving the implementation of these same neural processes in more complex, real-world environments containing randomness and uncertainty up for debate.
To facilitate investigation into the perception of less tightly controlled listening scenarios, we present a computational model as a tool for asking targeted questions about the underlying predictive processes that connect complex sensory inputs to listener behavior and neural responses. In the modeling framework, observed sound features (e.g., pitch) are tracked sequentially using Bayesian inference. Sufficient statistics are inferred from past observations at multiple time scales and used to make predictions about future observations while tracking the statistical structure of the sensory input.
Facets of the model are discussed in terms of their application to perceptual research, and examples taken from real-world audio demonstrate the model's flexibility to capture a variety of statistical structures along various perceptual dimensions.
Previous models are often targeted toward interpreting a particular experimental paradigm (e.g., oddball paradigm), perceptual dimension (e.g., pitch processing), or task (e.g., speech segregation), thus limiting their ability to generalize to other domains. The presented model is designed as a flexible and practical tool for broad application.
The model is presented as a general framework for generating new hypotheses and guiding investigation into the neural processes underlying predictive coding of complex scenes.
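The sequential Bayesian tracking described in the abstract can be sketched in a minimal form: sufficient statistics of a single sound feature are accumulated at several time scales with exponential forgetting, and the resulting posterior means are combined into a one-step-ahead prediction. This is a hypothetical illustration, not the paper's actual model: the known-variance Gaussian likelihood, the particular decay constants, and the uniform weighting across time scales are all simplifying assumptions introduced here.

```python
import numpy as np

def track_feature(observations, timescales=(0.99, 0.9, 0.5)):
    """Sequentially track a 1-D sound feature (e.g., pitch) and
    return one-step-ahead predictions.

    Illustrative sketch only: conjugate Normal updates with a
    known observation variance, one exponential-forgetting decay
    constant per time scale (values chosen arbitrarily here).
    """
    preds = []
    # Sufficient statistics per time scale:
    # effective observation count n and decayed sum s.
    n = np.zeros(len(timescales))
    s = np.zeros(len(timescales))
    for x in observations:
        # Predict: posterior mean at each time scale, then average.
        # (Uniform weighting across scales is a simplification.)
        means = np.where(n > 0, s / np.maximum(n, 1e-12), 0.0)
        preds.append(float(means.mean()))
        # Update sufficient statistics with exponential forgetting,
        # so shorter time scales discount the past more heavily.
        for i, lam in enumerate(timescales):
            n[i] = lam * n[i] + 1.0
            s[i] = lam * s[i] + x
    return np.array(preds)

# For a perfectly predictable input (a constant pitch), the
# predictions converge to the observed value at every time scale.
p = track_feature([440.0] * 50)
```

After a sudden change in the input statistics, the short time scales in such a scheme adapt quickly while the long ones retain the prior context, which is the kind of multi-scale behavior the framework is designed to capture.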