Reinforcement learning for a stochastic automaton modelling predation in stationary model-mimic environments.

Author information

Tsoularis A, Wallace J

Affiliation

Institute of Information and Mathematical Sciences, Massey University, Albany, P.O. Box 102 904, Auckland, New Zealand.

Publication information

Math Biosci. 2005 May;195(1):76-91. doi: 10.1016/j.mbs.2005.01.003.

DOI:10.1016/j.mbs.2005.01.003

PMID:15893338

Abstract

In this paper we propose a mathematical learning model for the feeding behaviour of a specialist predator operating in a random environment occupied by two types of prey, palatable mimics and unpalatable models, and of a generalist predator with additional alternative prey at its disposal. A well-known linear reinforcement learning algorithm and its special cases are considered for updating the probabilities of the two actions, eat prey or ignore prey. Each action elicits a probabilistic response from the environment that can be favourable or unfavourable. To assess the performance of the predator, a payoff function is constructed that captures the energetic benefit from consuming acceptable prey, the energetic cost from consuming unacceptable prey, and the lost benefit from ignoring acceptable prey. Conditions for an improving predator payoff are also explicitly formulated.
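
The abstract does not state the update rule itself, so the following is only a minimal sketch of one standard two-action linear reward-penalty scheme, which may differ from the exact form used in the paper. The function names, the step sizes `a` and `b`, the `mimic_fraction` parameter, and the rule that a response is favourable when the predator eats a mimic or ignores a model are all assumptions made for illustration.

```python
import random


def update_eat_probability(p_eat, action, favourable, a=0.1, b=0.05):
    """Apply one linear reward-penalty step and return the new P(eat).

    Assumed textbook form: a favourable response moves the chosen
    action's probability towards 1 at rate a; an unfavourable response
    shrinks it by the factor (1 - b). With two actions, the other
    probability is always the complement.
    """
    p_chosen = p_eat if action == "eat" else 1.0 - p_eat
    if favourable:
        p_chosen += a * (1.0 - p_chosen)
    else:
        p_chosen *= 1.0 - b
    return p_chosen if action == "eat" else 1.0 - p_chosen


def simulate(steps=5000, p_eat=0.5, mimic_fraction=0.7, a=0.05, b=0.0, seed=0):
    """Run the automaton in a stationary environment (hypothetical setup).

    mimic_fraction is the assumed probability that an encountered prey
    item is a palatable mimic. Setting b = 0 gives the reward-inaction
    special case, in which unfavourable responses change nothing.
    """
    rng = random.Random(seed)
    for _ in range(steps):
        action = "eat" if rng.random() < p_eat else "ignore"
        palatable = rng.random() < mimic_fraction
        # Assumed response rule: eating a mimic or ignoring a model is favourable.
        favourable = palatable if action == "eat" else not palatable
        p_eat = update_eat_probability(p_eat, action, favourable, a, b)
    return p_eat
```

Calling `simulate()` returns the predator's final probability of eating after the given number of encounters. Because both branches of the update keep the chosen probability in [0, 1] whenever `a` and `b` lie in [0, 1], the returned value is always a valid probability.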
