
Foraging decisions as multi-armed bandit problems: Applying reinforcement learning algorithms to foraging data.

Affiliations

Department of Biological Sciences, Macquarie University, North Ryde, NSW 2109, Australia; Programa de Pós-Graduação em Ecologia e Conservação, Federal University of Paraná, Curitiba, Brazil, 19031, CEP 81531-990.

Publication Information

J Theor Biol. 2019 Apr 21;467:48-56. doi: 10.1016/j.jtbi.2019.02.002. Epub 2019 Feb 6.

Abstract

Finding resources is crucial for animals to survive and reproduce, but our understanding of the decision-making that underlies whether animals explore new resources or continue exploiting old ones remains incomplete. Theory predicts an 'exploration-exploitation trade-off', whereby animals must balance their effort between staying to exploit a seemingly good resource and moving to explore the environment. To date, however, it has been challenging to build flexible yet tractable statistical models that capture this trade-off, and our understanding of foraging decisions is limited. Here, I suggest that foraging decisions can be seen as multi-armed bandit problems, and apply a deterministic algorithm (the Upper-Confidence-Bound, or 'UCB') and a Bayesian algorithm (Thompson Sampling, or 'TS') to demonstrate how these algorithms generate testable a priori predictions from simulated data. Next, I use UCB and TS to analyse empirical foraging data from larvae of the tephritid fruit fly Bactrocera tryoni, providing a qualitative and quantitative framework for quantifying animal foraging behaviour. Qualitative analysis revealed that TS displays a shorter exploration period than UCB, although both converged to similar qualitative results. Quantitative analysis demonstrated that, overall, UCB predicts the observed foraging patterns more accurately than TS, even though both algorithms failed to quantitatively estimate the empirical foraging patterns in high-density groups (i.e., groups of 50 and, more strikingly, 100 larvae), likely owing to the influence of intraspecific competition on animal behaviour. The framework proposed here demonstrates how reinforcement learning algorithms can be used to model animal foraging decisions.
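The abstract frames foraging choices as a multi-armed bandit and compares a deterministic policy (UCB) with a Bayesian one (Thompson Sampling). The sketch below is a generic, minimal illustration of those two algorithms on a simulated two-armed Bernoulli bandit; it is not the author's implementation, and the reward model, Beta(1, 1) priors, arm probabilities, and parameter values are assumptions chosen only for demonstration.

```python
# Minimal sketch: UCB1 and Thompson Sampling on a simulated Bernoulli bandit.
# All settings here (arm probabilities, priors, horizon) are illustrative
# assumptions, not values taken from the paper.
import math
import random

def simulate(true_probs, n_steps=1000, policy="ucb", seed=0):
    """Run one bandit episode; each arm ('patch') pays a Bernoulli reward."""
    rng = random.Random(seed)
    k = len(true_probs)
    counts = [0] * k       # times each arm was sampled
    successes = [0] * k    # rewards obtained per arm
    total_reward = 0

    for t in range(1, n_steps + 1):
        if policy == "ucb":
            # UCB1: play each arm once, then choose the arm with the highest
            # mean reward plus an exploration bonus that shrinks with sampling.
            if t <= k:
                arm = t - 1
            else:
                arm = max(
                    range(k),
                    key=lambda a: successes[a] / counts[a]
                    + math.sqrt(2 * math.log(t) / counts[a]),
                )
        else:
            # Thompson Sampling: draw one sample from each arm's Beta posterior
            # (Beta(1, 1) prior) and play the arm with the largest draw.
            arm = max(
                range(k),
                key=lambda a: rng.betavariate(1 + successes[a],
                                              1 + counts[a] - successes[a]),
            )

        reward = 1 if rng.random() < true_probs[arm] else 0
        counts[arm] += 1
        successes[arm] += reward
        total_reward += reward

    return total_reward, counts

# Two food patches of unequal quality; compare how each policy allocates visits.
for policy in ("ucb", "ts"):
    reward, visits = simulate([0.3, 0.7], policy=policy)
    print(policy, reward, visits)
```

In a simulation like this, both policies concentrate visits on the better patch over time; how quickly each stops sampling the poorer patch is the kind of qualitative difference in exploration period that the paper examines.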

