Feedback Delays: How Can Decision Makers Learn Not to Buy a New Car Every Time the Garage Is Empty?

Author information

Gibson FP

Affiliation

University of Michigan Business School

Publication information

Organ Behav Hum Decis Process. 2000 Sep;83(1):141-166. doi: 10.1006/obhd.2000.2906.

DOI:10.1006/obhd.2000.2906

PMID:10973786

Abstract

Decision makers in dynamic environments (e.g., stock trading, inventory control, and firefighting) learn poorly in experiments where feedback about the outcomes of their actions is delayed. In searching for ways to mitigate these effects, this paper presents two computational models of learning with feedback delays and contrasts them against human decision-makers' performance. The no-memory model hypothesizes that decision makers always perceive feedback as immediate. The with-memory model hypothesizes that, over time, decision makers are able to develop internal representations of the task that help them to perform with delayed feedback. As borne out by human subjects, both models predict that a display of past history improves learning with delay and that increasing delay increasingly degrades performance. Even though the length of training in this task exceeds that used in many laboratory-based dynamic tasks, neither the two models nor the subjects are able to effectively learn without decision aids when faced with feedback delays. When given an amount of training that more closely approximates that provided in functioning dynamic environments, the with-memory model predicts that human decision makers may learn without decision aids over the long term if feedback delays are simple. These results raise several issues for continued theoretical investigation as well as potential suggestions for training and supporting decision makers in dynamic environments with feedback delays. Copyright 2000 Academic Press.
