Generalization and Search in Risky Environments.

How do people pursue rewards in risky environments, where some outcomes should be avoided at all costs? We investigate how participants search for spatially correlated rewards in scenarios where they must avoid sampling rewards below a given threshold. This requires not only balancing exploration and exploitation, but also reasoning about how to avoid potentially risky areas of the search space. Within risky versions of the spatially correlated multi-armed bandit task, we show that participants' behavior aligns well with a Gaussian process function learning algorithm that chooses points based on a safe optimization routine. Moreover, using leave-one-block-out cross-validation, we find that participants adapt their sampling behavior to the riskiness of the task, although the underlying function learning mechanism remains relatively unchanged. These results show that participants can adapt their search behavior to the adversity of the environment, and they enrich our understanding of adaptive behavior in the face of risk and uncertainty.
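As a rough illustration of the kind of model the abstract names, the following is a minimal sketch of Gaussian process function learning combined with a safe choice rule on a small grid. The abstract does not specify the kernel, the acquisition rule, or any parameter values, so everything here is an assumption made for illustration: the squared-exponential kernel, the zero prior mean, the confidence-bound definition of the safe set, and the names `rbf_kernel`, `gp_posterior`, `safe_choice`, `length_scale`, and `beta` are all hypothetical and are not taken from the paper.

```python
import numpy as np

def rbf_kernel(a, b, length_scale=2.0):
    """Squared-exponential kernel between two sets of 2-D points (assumed kernel)."""
    sq_dist = ((a[:, None, :] - b[None, :, :]) ** 2).sum(axis=-1)
    return np.exp(-sq_dist / (2.0 * length_scale ** 2))

def gp_posterior(x_obs, y_obs, x_all, noise=1e-3, prior_mean=0.0):
    """Posterior mean and standard deviation of a GP at every point in x_all."""
    k_oo = rbf_kernel(x_obs, x_obs) + noise * np.eye(len(x_obs))
    k_ao = rbf_kernel(x_all, x_obs)
    alpha = np.linalg.solve(k_oo, y_obs - prior_mean)
    mean = prior_mean + k_ao @ alpha
    v = np.linalg.solve(k_oo, k_ao.T)            # shape (n_obs, n_all)
    var = 1.0 - np.sum(k_ao * v.T, axis=1)       # prior variance is 1 for this kernel
    return mean, np.sqrt(np.clip(var, 0.0, None))

def safe_choice(x_obs, y_obs, x_all, threshold, beta=0.5):
    """Pick the option with the highest upper bound among options judged safe.

    An option counts as safe when its lower confidence bound
    (mean - beta * sd) is at or above the threshold (illustrative rule).
    """
    mean, sd = gp_posterior(x_obs, y_obs, x_all)
    lower, upper = mean - beta * sd, mean + beta * sd
    safe = lower >= threshold
    if not safe.any():
        # Nothing is provably safe: fall back to the least risky option.
        return int(np.argmax(lower))
    return int(np.argmax(np.where(safe, upper, -np.inf)))

# Usage: a 5x5 grid with a single observed reward of 0.9 at the center.
grid = np.array([[i, j] for i in range(5) for j in range(5)], dtype=float)
x_obs = np.array([[2.0, 2.0]])
y_obs = np.array([0.9])
choice = safe_choice(x_obs, y_obs, grid, threshold=0.5)
```

In this sketch, generalization comes from the kernel: options near a known high reward inherit a high predicted mean and a low uncertainty, so they enter the safe set first. A larger `beta` makes the learner more cautious and shrinks the safe set toward already sampled options.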

