Alcalá-Quintana Rocío, García-Pérez Miguel A
Departamento de Metodología, Facultad de Psicología, Universidad Complutense, Campus de Somosaguas, 28223 Madrid, Spain.
Spat Vis. 2005;18(3):347-74. doi: 10.1163/1568568054089375.
Threshold estimation with sequential procedures is justifiable on the surmise that the index used in the so-called dynamic stopping rule has diagnostic value for identifying when an accurate estimate has been obtained. The performance of five types of Bayesian sequential procedure was compared here to that of an analogous fixed-length procedure. Indices for use in sequential procedures were: (1) the width of the Bayesian probability interval, (2) the posterior standard deviation, (3) the absolute change, (4) the average change, and (5) the number of sign fluctuations. A simulation study was carried out to evaluate which index renders estimates with less bias and smaller standard error at lower cost (i.e. lower average number of trials to completion), in both yes-no and two-alternative forced-choice (2AFC) tasks. We also considered the effect of the form and parameters of the psychometric function and its similarity with the model function assumed in the procedure. Our results show that sequential procedures do not outperform fixed-length procedures in yes-no tasks. However, in 2AFC tasks, sequential procedures not based on sign fluctuations all yield minimally better estimates than fixed-length procedures, although most of the improvement occurs with short runs that render undependable estimates and the differences vanish when the procedures run for a number of trials (around 70) that ensures dependability. Thus, none of the indices considered here (some of which are widespread) has the diagnostic value that would justify its use. In addition, difficulties of implementation make sequential procedures unfit as alternatives to fixed-length procedures.
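To make the comparison concrete, the following is a minimal sketch of a Bayesian adaptive run that can end either after a fixed number of trials or when a dynamic stopping index falls below a criterion. It is illustrative only and is not the simulation code used in the study: the logistic model function, the uniform prior over a discrete threshold grid, stimulus placement at the posterior mean, the lapse rate, and all numeric values (grid range, `sd_criterion`, `max_trials`) are assumptions, as are the function names. Of the five indices, only the posterior standard deviation is used as the stopping index here; the width of the central probability interval is computed alongside it for reference.

```python
import math
import random


def logistic(x, threshold, slope=1.0, guess=0.0, lapse=0.02):
    """Assumed model function: probability of a 'yes' (or correct) response at level x."""
    core = 1.0 / (1.0 + math.exp(-slope * (x - threshold)))
    return guess + (1.0 - guess - lapse) * core


def posterior_summary(grid, post, mass=0.95):
    """Return posterior mean, posterior SD, and width of the central probability interval."""
    mean = sum(g * p for g, p in zip(grid, post))
    var = sum((g - mean) ** 2 * p for g, p in zip(grid, post))
    tail = (1.0 - mass) / 2.0
    cum, lo, hi, lo_found = 0.0, grid[0], grid[-1], False
    for g, p in zip(grid, post):
        cum += p
        if not lo_found and cum >= tail:
            lo, lo_found = g, True
        if cum >= 1.0 - tail:
            hi = g
            break
    return mean, math.sqrt(max(var, 0.0)), hi - lo


def run_procedure(true_threshold, sd_criterion=None, max_trials=70, guess=0.0, seed=0):
    """Simulate one run; guess=0.0 mimics a yes-no task, guess=0.5 a 2AFC task.

    With sd_criterion=None the run is fixed-length (max_trials trials).
    Otherwise it stops early once the posterior SD reaches the criterion.
    """
    rng = random.Random(seed)
    grid = [i * 0.1 for i in range(-50, 51)]   # candidate thresholds (assumed range)
    post = [1.0 / len(grid)] * len(grid)       # uniform prior (assumption)
    mean, sd, width = posterior_summary(grid, post)
    trials = 0
    while trials < max_trials:
        x = mean                               # next stimulus at the posterior mean
        response = rng.random() < logistic(x, true_threshold, guess=guess)
        # Bayes update: multiply the posterior by the likelihood of the response.
        likelihood = [
            logistic(x, t, guess=guess) if response else 1.0 - logistic(x, t, guess=guess)
            for t in grid
        ]
        post = [p * l for p, l in zip(post, likelihood)]
        total = sum(post)
        post = [p / total for p in post]
        mean, sd, width = posterior_summary(grid, post)
        trials += 1
        if sd_criterion is not None and sd <= sd_criterion:
            break                              # dynamic stopping rule satisfied
    return mean, sd, width, trials


if __name__ == "__main__":
    fixed = run_procedure(0.5, sd_criterion=None, max_trials=70)
    sequential = run_procedure(0.5, sd_criterion=0.8, max_trials=200)
    print("fixed-length:", fixed)
    print("sequential:  ", sequential)
```

Repeating such runs over many seeds and comparing the bias and standard error of the returned estimates against the average number of trials is the kind of cost-versus-accuracy comparison the abstract describes, although the specific settings above are placeholders rather than the conditions studied.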