J Exp Anal Behav. 1967 Jan;10(1):57-65. doi: 10.1901/jeab.1967.10-57.
When a pigeon's choices between two keys are probabilistically reinforced, as in discrete-trial probability learning procedures and in concurrent variable-interval schedules, the bird tends to maximize, that is, to choose the alternative with the higher probability of reinforcement. In concurrent variable-interval schedules, steady-state matching, defined as an approximate equality between the relative frequency of a response and the relative frequency of reinforcement of that response, has previously been obtained only as a consequence of maximizing. In the present experiment, maximizing was impossible. A choice of one of two keys was reinforced only if it formed, together with the three preceding choices, the sequence of four successive choices that had occurred least often. This sequence was determined by a Bernoulli-trials process with parameter p. Each of three pigeons matched when p was 1/2 or 1/4. Therefore, steady-state matching by individual birds is not always a consequence of maximizing. Choice probability varied between successive reinforcements, and sequential statistics revealed dependencies that were adequately described by a Bernoulli-trials process with p depending on the time since the preceding reinforcement.
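The reinforcement rule described above can be illustrated with a minimal sketch. It is not the apparatus or procedure of the original experiment. In particular, it assumes that "occurred least often" is judged relative to the frequency each four-choice sequence would be expected to have under Bernoulli trials with parameter p (taken here as the probability of a left-key choice), which is only one reading of the abstract. The names `sequence_probability`, `LeastFrequentSequenceSchedule`, and `simulate`, the tie-breaking rule, and the simulated chooser with a fixed left-key probability `q` are all hypothetical.

```python
import random
from collections import Counter
from itertools import product


def sequence_probability(seq, p):
    """Probability of a left/right sequence under Bernoulli trials with P(left) = p."""
    prob = 1.0
    for choice in seq:
        prob *= p if choice == "L" else 1.0 - p
    return prob


class LeastFrequentSequenceSchedule:
    """Reinforce a choice only if it completes the currently least-frequent sequence."""

    def __init__(self, p, length=4):
        self.length = length
        self.sequences = ["".join(s) for s in product("LR", repeat=length)]
        self.expected = {s: sequence_probability(s, p) for s in self.sequences}
        self.counts = Counter({s: 0 for s in self.sequences})
        self.history = ""

    def target(self):
        # ASSUMPTION: "least often" means the lowest observed count relative to
        # the count expected under the Bernoulli process. Ties go to the first
        # sequence in enumeration order, an arbitrary choice made for this sketch.
        return min(self.sequences, key=lambda s: self.counts[s] / self.expected[s])

    def choose(self, choice):
        """Record one choice ('L' or 'R') and return True if it is reinforced."""
        self.history = (self.history + choice)[-self.length:]
        if len(self.history) < self.length:
            return False  # fewer than three preceding choices so far
        reinforced = self.history == self.target()
        self.counts[self.history] += 1
        return reinforced


def simulate(p, q, trials=10000, seed=0):
    """Hypothetical chooser that picks left with fixed probability q.

    Returns (relative frequency of left choices,
             relative frequency of reinforcement for left choices).
    """
    rng = random.Random(seed)
    schedule = LeastFrequentSequenceSchedule(p)
    left_choices = left_reinforcers = reinforcers = 0
    for _ in range(trials):
        choice = "L" if rng.random() < q else "R"
        left_choices += choice == "L"
        if schedule.choose(choice):
            reinforcers += 1
            left_reinforcers += choice == "L"
    rel_reinf = left_reinforcers / reinforcers if reinforcers else float("nan")
    return left_choices / trials, rel_reinf
```

Calling `simulate(0.25, 0.25)` returns the two relative frequencies whose approximate equality defines matching. Because the target sequence changes as the counts change, no fixed preference for one key is guaranteed the higher reinforcement probability, which is the sense in which such a contingency can rule out simple maximizing.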