Hsiao-Hui Lin, Yuh-Tsuen Tzeng
Yuh-Tsuen Tzeng, Center for Teacher Education and Institute of Education, National Chung Cheng University, 13-F.-8, No. 56, Sec. 2, Roosevelt Rd., Taipei 10084, Taiwan.
J Appl Meas. 2018;19(3):320-337.
This study aimed to advance the Scientific Multi-Text Reading Comprehension Assessment (SMTRCA) by developing a rubric consisting of four subscales: information retrieval, information generalization, information interpretation, and information integration. The assessment tool comprised 11 closed-ended items, 8 open-ended items, and the accompanying rubric. Two texts presenting opposing views in the dispute over whether to continue construction of the Fourth Nuclear Power Plant in Taiwan were developed, and 1535 grade 5-9 students read the two texts in a counterbalanced order and answered the test items. First, the results showed that Cronbach's alpha values exceeded .9, indicating very good intra-rater consistency, and Kendall's coefficient of concordance for inter-rater reliability was larger than .8, denoting a consistent scoring pattern between raters. Second, many-facet Rasch measurement analysis showed significant differences in rater severity, and both severe and lenient raters could distinguish high- from low-ability students effectively. A comparison of the rating scale model and the partial credit model indicated that each rater had a unique rating scale structure: because scoring involves human interpretation and evaluation, machine-like consistency is difficult to reach, which is in line with expectations for typical human judgment processes. Third, Cronbach's alpha for the full assessment was above .85, denoting that the SMTRCA has high internal consistency. Finally, confirmatory factor analysis showed an acceptable goodness-of-fit for the SMTRCA. These results suggest that the SMTRCA is a useful tool for measuring multi-text reading comprehension abilities.