Trends in Classroom Observation Scores.

Author information

Casabianca Jodi M, Lockwood J R, McCaffrey Daniel F

Affiliations

The University of Texas at Austin, Austin, TX, USA.

Educational Testing Service, Princeton, NJ, USA.

Publication information

Educ Psychol Meas. 2015 Apr;75(2):311-337. doi: 10.1177/0013164414539163. Epub 2014 Jun 22.

DOI:10.1177/0013164414539163

PMID:29795823

Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC5965595/

Abstract

Observations and ratings of classroom teaching and interactions collected over time are susceptible to trends in both the quality of instruction and rater behavior. These trends have potential implications for inferences about teaching and for study design. We use scores on the Classroom Assessment Scoring System-Secondary (CLASS-S) protocol from 458 middle school teachers over a 2-year period to study changes over time in (a) the average quality of teaching for the population of teachers, (b) the average severity of the population of raters, and (c) the severity of individual raters. To obtain these estimates and assess them in the context of other factors that contribute to the variability in scores, we develop an augmented G study model that is broadly applicable for modeling sources of variability in classroom observation ratings data collected over time. In our data, we found that trends in teaching quality were small. Rater drift was very large during raters' initial days of observation and persisted throughout nearly 2 years of scoring. Raters did not converge to a common level of severity; using our model we estimate that variability among raters actually increases over the course of the study. Variance decompositions based on the model find that trends are a modest source of variance relative to overall rater effects, rater errors on specific lessons, and residual error. The discussion provides possible explanations for trends and rater divergence as well as implications for designs collecting ratings over time.

Similar articles

Rater characteristics, response content, and scoring contexts: Decomposing the determinates of scoring accuracy.

Front Psychol. 2022 Aug 10;13:937097. doi: 10.3389/fpsyg.2022.937097. eCollection 2022.

Inter-rater reliability and generalizability of patient note scores using a scoring rubric based on the USMLE Step-2 CS format.

Adv Health Sci Educ Theory Pract. 2016 Oct;21(4):761-73. doi: 10.1007/s10459-015-9664-3. Epub 2016 Jan 12.

The influence of fidelity of implementation on teacher-student interaction quality in the context of a randomized controlled trial of the Responsive Classroom approach.

J Sch Psychol. 2013 Aug;51(4):437-53. doi: 10.1016/j.jsp.2013.03.001. Epub 2013 Apr 22.

Making Inferences About Teacher Observation Scores Over Time.

Educ Psychol Meas. 2019 Aug;79(4):636-664. doi: 10.1177/0013164419826237. Epub 2019 Jan 30.

Selecting and Simplifying: Rater Performance and Behavior When Considering Multiple Competencies.

Teach Learn Med. 2016;28(1):41-51. doi: 10.1080/10401334.2015.1107489.

Can two psychotherapy process measures be dependably rated simultaneously? A generalizability study.

J Couns Psychol. 2012 Oct;59(4):638-44. doi: 10.1037/a0030037.

Incorporating Criterion Ratings Into Model-Based Rater Monitoring Procedures Using Latent-Class Signal Detection Theory.

Appl Psychol Meas. 2017 Sep;41(6):472-491. doi: 10.1177/0146621617698452. Epub 2017 Mar 27.

An investigation of the generalizability and dependability of direct behavior rating single item scales (DBR-SIS) to measure academic engagement and disruptive behavior of middle school students.

J Sch Psychol. 2010 Jun;48(3):219-46. doi: 10.1016/j.jsp.2010.02.001. Epub 2010 Mar 2.

Bias in psychotherapist ratings of client transference and insight.

Psychotherapy (Chic). 2007 Sep;44(3):300-15. doi: 10.1037/0033-3204.44.3.300.

Cited by

Applying multivariate generalizability theory to compose and rank the evaluation scores of college teachers' teaching ability.

Sci Rep. 2025 Jul 5;15(1):24091. doi: 10.1038/s41598-025-08550-w.

Examining the Instructional Sensitivity of Constructed-Response Achievement Test Item Scores.

Educ Psychol Meas. 2025 Jan 30:00131644241313212. doi: 10.1177/00131644241313212.

Making Inferences About Teacher Observation Scores Over Time.

Educ Psychol Meas. 2019 Aug;79(4):636-664. doi: 10.1177/0013164419826237. Epub 2019 Jan 30.

What About the "Instruction" in Instructional Sensitivity? Raising a Validity Issue in Research on Instructional Sensitivity.

Educ Psychol Meas. 2018 Aug;78(4):635-652. doi: 10.1177/0013164417714846. Epub 2017 Jun 23.

A Multivariate Generalizability Theory Approach to College Students' Evaluation of Teaching.

Front Psychol. 2018 Jun 26;9:1065. doi: 10.3389/fpsyg.2018.01065. eCollection 2018.

References

Detecting score drift in a high-stakes performance-based assessment.

Adv Health Sci Educ Theory Pract. 2004;9(1):29-38. doi: 10.1023/B:AHSE.0000012214.40340.03.
