Department of Curriculum & Instruction, Texas State University, San Marcos, TX, 78666, USA.
Department of Education and Human Services, Lehigh University, Bethlehem, PA, USA.
Ann Dyslexia. 2021 Jul;71(2):238-259. doi: 10.1007/s11881-020-00212-y. Epub 2021 Jan 13.
This study investigated the dependability of reading comprehension scores across different text genres and response formats for readers with varied language knowledge. Participants included 78 fourth-graders in an urban elementary school. A randomized, counterbalanced 3 × 2 design crossed three response formats (open-ended, multiple-choice, retell) with two text genres (narrative, expository) from the Qualitative Reading Inventory (QRI-5) reading comprehension test. Standardized language knowledge measures from the Woodcock-Johnson III Tests of Achievement (Academic Knowledge, Oral Comprehension, Picture Vocabulary) defined three reader profiles: (a) < 90 as emerging, (b) 90-100 as basic, and (c) > 100 as proficient. Generalizability studies partitioned the variance in scores attributable to reader, text genre, and response format for all three groups. Response format accounted for 42.8% to 62.4% of the variance in reading comprehension scores across groups, whereas text genre accounted for very little variance (1.2-4.1%). Single scores fell well below a 0.80 dependability threshold (absolute phi coefficients = 0.06-0.14). Decision studies projecting the dependability achievable with additional scores varied by response format within each language knowledge group, with very low projected dependability for open-ended and multiple-choice scores among readers with basic language knowledge. Multiple-choice scores showed similarly low projected dependability for readers with emerging language knowledge. Findings provide evidence of interactions between reader language knowledge and response format in reading comprehension assessment. Implications underscore the limitations of using a single score to classify readers with and without proficiency in foundational skills.
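For context on the dependability values reported above, a minimal sketch of the absolute dependability (phi) coefficient from generalizability theory follows; the fully crossed reader × genre × format layout and the variance-component symbols (\sigma^2_p for readers, \sigma^2_g for genre, \sigma^2_f for format) are standard G-theory notation used here for illustration, not quantities taken from the article.

\Phi = \frac{\sigma^2_p}{\sigma^2_p + \sigma^2_\Delta}, \qquad
\sigma^2_\Delta = \frac{\sigma^2_g}{n_g} + \frac{\sigma^2_f}{n_f} + \frac{\sigma^2_{pg}}{n_g} + \frac{\sigma^2_{pf}}{n_f} + \frac{\sigma^2_{gf}}{n_g n_f} + \frac{\sigma^2_{pgf,e}}{n_g n_f}

A decision (D) study projects \Phi by substituting larger numbers of genres (n_g) and response formats (n_f) into the error term \sigma^2_\Delta, which is how the dependability expected from averaging additional scores is estimated.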