Shang Yuxin, Aryadoust Vahid, Hou Zhuohan
National Institute of Education, Nanyang Technological University, Singapore 639798, Singapore.
School of International Studies, Zhejiang University, Hangzhou 310058, China.
Brain Sci. 2024 Jul 25;14(8):746. doi: 10.3390/brainsci14080746.
To investigate the reliability of L2 listening tests and explore factors that may affect it, the present study conducted a reliability generalization (RG) meta-analysis. A total of 122 alpha coefficients of L2 listening tests were collected from 92 published articles and submitted to a linear mixed-effects RG analysis. The papers were coded using a scheme of 16 variables in three categories: study features, test features, and statistical results. The results showed an average reliability of 0.818 (95% CI: 0.803 to 0.833), with 40% of reliability estimates falling below the lower bound of the CI. Publication bias and heterogeneity were found in the reliability of L2 listening tests, indicating that low reliability coefficients were likely omitted from some published studies. In addition, two factors predicted the reliability of L2 listening tests: the number of items and test type (standardized versus researcher- or teacher-designed tests). The study also found that reliability does not moderate the relationship between L2 listening scores and theoretically relevant constructs. Reliability induction was also identified in the reporting of L2 listening test reliability. Implications for researchers and teachers are discussed.
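For readers unfamiliar with the quantities involved, the following is a minimal sketch of how an alpha coefficient is computed and why the number of items can be expected to relate to reliability. It is not the authors' linear mixed-effects RG analysis. It uses the standard Cronbach's alpha formula and the classical Spearman-Brown prophecy formula, and the function names `cronbach_alpha` and `spearman_brown` and the score matrix are hypothetical, chosen only for illustration.

```python
from statistics import variance


def cronbach_alpha(scores):
    """Cronbach's alpha for a persons-by-items score matrix.

    scores: list of rows, one row per test taker, one column per item.
    Uses sample variances (n - 1 denominator) throughout.
    """
    n_items = len(scores[0])
    if n_items < 2:
        raise ValueError("alpha needs at least two items")
    # Variance of each item across test takers
    item_vars = [variance([row[j] for row in scores]) for j in range(n_items)]
    # Variance of each test taker's total score
    total_var = variance([sum(row) for row in scores])
    return (n_items / (n_items - 1)) * (1 - sum(item_vars) / total_var)


def spearman_brown(reliability, length_factor):
    """Predicted reliability when test length is multiplied by length_factor.

    Assumes the added (or removed) items are parallel to the existing ones,
    which real tests only approximate.
    """
    return (length_factor * reliability) / (1 + (length_factor - 1) * reliability)


# Hypothetical data: 4 test takers, 3 dichotomously scored items
example_scores = [
    [1, 1, 1],
    [1, 1, 0],
    [1, 0, 0],
    [0, 0, 0],
]
alpha = cronbach_alpha(example_scores)

# Starting from the average reliability reported in the abstract (0.818),
# project the effect of doubling or halving the number of items.
doubled = spearman_brown(0.818, 2)
halved = spearman_brown(0.818, 0.5)
print(f"alpha={alpha:.3f} doubled={doubled:.3f} halved={halved:.3f}")
```

Under the parallel-items assumption, lengthening a test raises the projected reliability and shortening it lowers it, which is consistent in direction with the abstract's finding that the number of items predicts reliability.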