

Graduate School of Brain Sciences, Tamagawa University, Tokyo, Japan.

Neural Netw. 2012 Nov;35:88-91. doi: 10.1016/j.neunet.2012.08.004. Epub 2012 Aug 24.

The impulsive preference of an animal for an immediate reward implies that it might subjectively discount the value of potential future outcomes. Reinforcement learning theory provides an established theoretical framework for maximizing this discounted subjective value, and the framework has been applied successfully in engineering. However, this study identified a limitation when the framework is applied to animal behavior: in some cases, there is no learning goal. A possible learning framework was proposed that is well-posed in all cases and consistent with the impulsive preference.
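As a rough illustration of what "discounting" means here, the sketch below computes a standard exponentially discounted sum of rewards and shows how a small discount factor produces a preference for an immediate reward. It is not the framework proposed in the paper; the reward sequences, the function names `discounted_value` and `preferred`, and the discount factors are all hypothetical values chosen for the example.

```python
def discounted_value(rewards, gamma):
    """Return the sum of gamma**t * r_t over a reward sequence indexed by step t."""
    return sum((gamma ** t) * r for t, r in enumerate(rewards))


# Hypothetical choice: a small reward now versus a larger reward after a delay.
immediate = [1.0, 0.0, 0.0, 0.0, 0.0]  # 1 unit of reward at t = 0
delayed = [0.0, 0.0, 0.0, 0.0, 3.0]    # 3 units of reward at t = 4


def preferred(gamma):
    """Name the option with the larger discounted value for a given gamma."""
    v_now = discounted_value(immediate, gamma)
    v_later = discounted_value(delayed, gamma)
    return "immediate" if v_now > v_later else "delayed"


if __name__ == "__main__":
    for gamma in (0.5, 0.9):
        print(gamma, preferred(gamma))
```

Under these assumed numbers, a steep discount (gamma = 0.5) favors the immediate reward, while a shallow discount (gamma = 0.9) favors the delayed one.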




Reinforcement learning for discounted values often loses the goal in the application to animal learning.
