Impact of different scoring algorithms applied to multiple-mark survey items on outcome assessment: an in-field study on health-related knowledge.

Author information

Domnich A, Panatto D, Arata L, Bevilacqua I, Apprato L, Gasparini R, Amicizia D

Affiliation

Department of Health Sciences, University of Genoa, Italy.

Publication information

J Prev Med Hyg. 2015;56(4):E162-71.

Abstract

INTRODUCTION

Health-related knowledge is often assessed through multiple-choice tests. Among the different types of formats, researchers may opt to use multiple-mark items, i.e. items with more than one correct answer. Although multiple-mark items have long been used in the academic setting, sometimes with scant or inconclusive results, little is known about the implementation of this format in research on in-field health education and promotion.

METHODS

A study population of secondary school students completed a survey on nutrition-related knowledge, followed by a single-lecture intervention. Answers were scored by means of eight different scoring algorithms and analyzed from the perspective of classical test theory. The same survey was re-administered to a sample of the students in order to evaluate the short-term change in their knowledge.
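As an illustration of the classical test theory perspective, internal consistency is commonly summarized with Cronbach's alpha. The abstract does not name the statistic the authors used, so the sketch below is an assumption: the function name `cronbach_alpha` and the toy data are hypothetical and are not taken from the study.

```python
from statistics import variance


def cronbach_alpha(scores):
    """Cronbach's alpha for a respondents-by-items score matrix.

    scores: list of rows, one per respondent; each row holds that
    respondent's score on every item (same number of items per row).
    Illustrative only; not necessarily the statistic used in the study.
    """
    k = len(scores[0])                      # number of items
    if k < 2 or len(scores) < 2:
        raise ValueError("need at least 2 items and 2 respondents")
    item_columns = list(zip(*scores))       # transpose: one tuple per item
    sum_item_var = sum(variance(col) for col in item_columns)
    total_var = variance([sum(row) for row in scores])
    return k / (k - 1) * (1 - sum_item_var / total_var)


# Toy data (made up): two items that always agree across four respondents.
toy = [[1, 1], [0, 0], [1, 1], [0, 0]]
alpha = cronbach_alpha(toy)
```

Because each scoring algorithm produces a different item-score matrix from the same answers, a statistic of this kind can differ between algorithms even though the underlying responses are identical.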

RESULTS

In all, 286 questionnaires were analyzed. Partial scoring algorithms displayed better psychometric characteristics than the dichotomous rule. In particular, the algorithm proposed by Ripkey and the balanced rule showed greater internal consistency and relative efficiency in scoring multiple-mark items. A penalizing algorithm in which the proportion of marked distracters was subtracted from that of marked correct answers was the only one that highlighted a significant difference in performance between natives and immigrants, probably owing to its slightly better discriminatory ability. This algorithm was also associated with the largest effect size in the pre-/post-intervention score change.
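To make the contrast between scoring rules concrete, the following minimal sketch implements two of them as described in the abstract: the dichotomous (all-or-nothing) rule and the penalizing rule that subtracts the proportion of marked distracters from the proportion of marked correct answers. The function names, the example item, and the decision not to clip negative scores at zero are assumptions made for illustration; the abstract does not specify these details, and the other six algorithms are not reproduced here.

```python
def dichotomous_score(marked, correct):
    """All-or-nothing rule: 1 only if the marked set equals the answer key."""
    return 1.0 if set(marked) == set(correct) else 0.0


def penalizing_score(marked, correct, options):
    """Proportion of correct answers marked minus proportion of distracters marked.

    Assumption: negative scores are left as-is (no clipping at zero).
    """
    marked, correct, options = set(marked), set(correct), set(options)
    if not correct:
        raise ValueError("a multiple-mark item needs at least one correct answer")
    distracters = options - correct
    p_correct = len(marked & correct) / len(correct)
    # An item with no distracters cannot be penalized.
    p_distracter = len(marked & distracters) / len(distracters) if distracters else 0.0
    return p_correct - p_distracter


# Hypothetical item: five options, three of them correct.
options = ["A", "B", "C", "D", "E"]
key = ["A", "B", "C"]
answer = ["A", "B", "D"]   # two correct answers and one distracter marked

d = dichotomous_score(answer, key)          # 0.0: the key was not matched exactly
p = penalizing_score(answer, key, options)  # 2/3 - 1/2 = 1/6
```

In this example the dichotomous rule gives no credit for partial knowledge, whereas the penalizing rule awards a small positive score, which suggests how partial-credit rules can spread respondents over more score levels.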

DISCUSSION

The choice of an appropriate rule for scoring multiple-mark items in research on health education and promotion should consider not only the psychometric properties of single algorithms but also the study aims and outcomes, since scoring rules differ in terms of bias, reliability, difficulty, sensitivity to guessing and discrimination.


https://cdn.ncbi.nlm.nih.gov/pmc/blobs/b60f/4753817/89816a129ada/2421-4248-56-E162-g001.jpg
