Laboratoire de psychologie des Pays de la Loire (LPPL UR 4638), Nantes Université, 44000, Nantes, France.
Institut Universitaire de France, Paris, France.
Behav Res Methods. 2023 Jun;55(4):2021-2036. doi: 10.3758/s13428-022-01908-2. Epub 2022 Jul 6.
For researchers and psychologists interested in estimating a subject's memory capacity, the current standard for scoring memory span tasks is the partial-credit method: subjects are credited with the number of stimuli that they manage to recall correctly in the correct serial position. A critical issue with this method, however, is that intrusions and omissions can radically change the scores depending on where they occur. For example, when recalling the sequence ABCDE, "ABCD" is worth 4 points but "BCDE" is worth 0 points, because omitting the first item shifts every remaining item out of its serial position. This paper presents an improved scoring method based on the edit distance, that is, the number of changes required to edit the recalled sequence into the target. Edit-distance scoring gives results close to partial-credit scoring, but without the corresponding vulnerability to positional shifts. A reanalysis of memory performance in two large datasets (N = 1093 and N = 758) confirms that in addition to being more logically consistent, edit-distance scoring demonstrates similar or better psychometric properties than partial-credit scoring, with comparable validity, a small increase in reliability, and a substantial increase in test information (measurement precision in the context of item response theory). Test information was especially improved for harder items and for subjects with ability in the lower range, whose scores tend to be severely underestimated by partial-credit scoring. Code to compute edit-distance scores in various software packages is made available at https://osf.io/wdb83/ .
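The contrast between the two scoring methods can be illustrated with a minimal Python sketch. It is not the code distributed at the OSF link. It assumes a plain Levenshtein distance (insertions, deletions, and substitutions each count as one change) and assumes the score is the target length minus that distance, floored at zero; the abstract does not specify either choice, and the function names are hypothetical.

```python
def partial_credit(target, recalled):
    """Count items recalled in their correct serial position."""
    return sum(t == r for t, r in zip(target, recalled))


def edit_distance(source, target):
    """Levenshtein distance: minimum number of insertions, deletions,
    and substitutions needed to turn `source` into `target`.
    (Assumption: the abstract only says "number of changes".)"""
    # prev[j] holds the distance between the first i-1 items of source
    # and the first j items of target.
    prev = list(range(len(target) + 1))
    for i, s in enumerate(source, start=1):
        curr = [i]
        for j, t in enumerate(target, start=1):
            cost = 0 if s == t else 1
            curr.append(min(prev[j] + 1,          # delete s
                            curr[j - 1] + 1,      # insert t
                            prev[j - 1] + cost))  # substitute or match
        prev = curr
    return prev[-1]


def edit_distance_score(target, recalled):
    """Assumed scoring rule: target length minus edit distance, floored at 0."""
    return max(0, len(target) - edit_distance(recalled, target))


target = "ABCDE"
for recalled in ("ABCD", "BCDE"):
    print(recalled,
          partial_credit(target, recalled),
          edit_distance_score(target, recalled))
# ABCD 4 4
# BCDE 0 4
```

Under these assumptions, both recalls are one insertion away from the target and receive the same score of 4, whereas partial-credit scoring gives 4 and 0.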