Köstering Lena, Nitschke Kai, Schumacher F Konrad, Weiller Cornelius, Kaller Christoph P
Department of Neurology, University Medical Center, University of Freiburg.
Psychol Assess. 2015 Sep;27(3):925-31. doi: 10.1037/pas0000097. Epub 2015 Mar 30.
Test-retest reliability is difficult to establish for measures of executive functioning that rely on task novelty. Correspondingly, evidence on the test-retest reliability of the commonly used Tower of London (TOL) planning task is, as yet, equivocal and based only on indices of relative consistency rather than absolute agreement of individual scores. Further, the stability of planning latencies over repeated testing has not been investigated. The present study assessed the test-retest reliability of planning performance measures using a structurally balanced problem set implemented in the TOL-Freiburg version (TOL-F). The TOL-F was administered in 2 structurally identical versions to a sample of young, healthy adults over a 1-week interval. For planning accuracy, the Pearson correlation and the intraclass correlation coefficient for relative consistency were adequate (r = .739 and .734), with the intraclass correlation coefficient for absolute agreement only slightly lower (r = .690). For initial thinking and movement execution times, reliability indices for relative consistency and absolute agreement were uniformly low (all r between .274 and .519). Given the adequate test-retest reliability of planning accuracy, the TOL-F can be reliably used to measure planning ability in group-based studies and with individual participants, as is important for clinical testing. Planning latencies, however, should be used only as complementary, not sole, measures of planning ability, particularly for normative evaluations in clinical assessment. In sum, TOL-F planning accuracy possesses adequate absolute and relative test-retest reliability for experimental utility. Future studies should assess whether this indeed translates into clinical utility of the TOL-F for measuring planning ability in patients.
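The distinction between relative consistency and absolute agreement can be illustrated with a minimal sketch. It assumes the two-way, single-measurement intraclass correlation forms, often written ICC(C,1) and ICC(A,1); the abstract does not state which ICC model the study used. The function names and the scores below are hypothetical and are not data from the study.

```python
from statistics import mean


def pearson_r(x, y):
    """Pearson correlation between two equally long score lists."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def icc_two_way(scores):
    """Return (consistency, agreement) single-measurement ICCs.

    scores: one row per participant, one column per session.
    Assumed model: two-way ANOVA decomposition, ICC(C,1) and ICC(A,1).
    """
    n, k = len(scores), len(scores[0])
    grand = mean(v for row in scores for v in row)
    row_means = [mean(row) for row in scores]
    col_means = [mean(row[j] for row in scores) for j in range(k)]

    # Sums of squares for participants (rows), sessions (columns), residual.
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((v - grand) ** 2 for row in scores for v in row)
    ss_err = ss_total - ss_rows - ss_cols

    msr = ss_rows / (n - 1)              # between-participant mean square
    msc = ss_cols / (k - 1)              # between-session mean square
    mse = ss_err / ((n - 1) * (k - 1))   # residual mean square

    consistency = (msr - mse) / (msr + (k - 1) * mse)
    # Absolute agreement additionally penalizes systematic session shifts.
    agreement = (msr - mse) / (msr + (k - 1) * mse + k / n * (msc - mse))
    return consistency, agreement


# Hypothetical accuracy scores: every participant gains 2 points at retest.
test_scores = [10, 12, 14, 16, 18]
retest_scores = [s + 2 for s in test_scores]

r = pearson_r(test_scores, retest_scores)
icc_c, icc_a = icc_two_way([list(p) for p in zip(test_scores, retest_scores)])
print(round(r, 3), round(icc_c, 3), round(icc_a, 3))
```

In this made-up case the rank order of participants is perfectly preserved, so the Pearson correlation and the consistency ICC both equal 1, while the uniform 2-point gain at retest lowers the agreement ICC to about .83. This is why an absolute agreement index matters when individual scores are compared against norms.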