Gadbury-Amyot Cynthia C, Kim Juhu, Palm Richard L, Mills G Edward, Noble Elizabeth, Overman Pamela R
Division of Dental Hygiene, School of Dentistry, University of Missouri-Kansas City, 64108, USA.
J Dent Educ. 2003 Sep;67(9):991-1002.
This study examined the validity and reliability of portfolio assessment using Messick's unified framework of construct validity. Theoretical and empirical evidence was sought for six aspects of construct validity. Seven faculty raters evaluated twenty student portfolios using a primary trait analysis scoring rubric. Significant correlations among the seven subscales of the scoring rubric (r = .81-.95; p < .01) indicate that they measure a common construct. Portfolio scores correlated significantly with GPA (r = .70; p < .01) and with the National Board Dental Hygiene Examination (NBDHE) (r = .60; p < .01). The relationship between portfolio scores and the Central Regional Dental Testing Service (CRDTS) examination was weak and nonsignificant (r = .19; p > .05). A fully crossed, two-facet generalizability (G) study design was used to examine reliability. ANOVA showed that the greatest source of variance was the scoring rubric itself, accounting for 78 percent of the total variance. The smallest source of variance was the interaction between portfolio and rubric (1.15 percent). Faculty rater variance accounted for only 1.28 percent of total variance. In the decision (D) study, increasing the subscales to fourteen and decreasing the faculty raters to three yielded a phi coefficient of .86, which is analogous to a reliability coefficient in classical test theory. In conclusion, the pattern of findings suggests that portfolios can serve as a valid and reliable measure for assessing student competency.
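The decision-study step described above can be illustrated with a minimal sketch. It assumes the two facets are faculty raters and rubric subscales, fully crossed with portfolios, and it uses the standard generalizability-theory formula for the phi (dependability) coefficient. The function name, the dictionary keys, and the variance components below are hypothetical placeholders chosen for illustration; they are not the study's estimates, and this is not necessarily how the authors ran their analysis.

```python
def phi_coefficient(var, n_raters, n_subscales):
    """Phi (index of dependability) for a fully crossed
    portfolio x rater x subscale design.

    `var` maps each effect to its estimated variance component:
      p     portfolios (object of measurement)
      r     raters
      s     subscales
      pr    portfolio x rater
      ps    portfolio x subscale
      rs    rater x subscale
      prs_e three-way interaction confounded with residual error
    """
    # Absolute error variance: every component except the portfolio
    # main effect, each divided by the number of conditions sampled
    # in the decision study.
    abs_error = (
        var["r"] / n_raters
        + var["s"] / n_subscales
        + var["pr"] / n_raters
        + var["ps"] / n_subscales
        + (var["rs"] + var["prs_e"]) / (n_raters * n_subscales)
    )
    return var["p"] / (var["p"] + abs_error)


# Hypothetical variance components (NOT taken from the study).
example_components = {
    "p": 4.0, "r": 1.0, "s": 2.0,
    "pr": 1.0, "ps": 2.0, "rs": 0.0, "prs_e": 4.0,
}

if __name__ == "__main__":
    # Compare alternative decision-study designs.
    for n_r, n_s in [(7, 7), (3, 14), (3, 7)]:
        phi = phi_coefficient(example_components, n_r, n_s)
        print(f"raters={n_r}, subscales={n_s}, phi={phi:.3f}")
```

Varying `n_raters` and `n_subscales` in this way shows how a decision study trades one facet against another: adding subscales shrinks the subscale-related error terms, which can offset the loss in dependability from using fewer raters.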