Flinders Innovation in Clinical Education, Flinders University, South Australia, Australia.
Med Educ. 2012 Jan;46(1):38-48. doi: 10.1111/j.1365-2923.2011.04098.x.
Programmatic assessment is the notion that the strength of an assessment process results from a careful combination of various assessment instruments. Accordingly, no single instrument is superior to another; each has its own strengths, weaknesses and purpose in a programme. Yet, in terms of psychometric methods, a one-size-fits-all approach is often used. Kane's view of validity as a series of arguments provides a useful framework from which to highlight the value of different widely used approaches to improving the quality and validity of assessment procedures.
In this paper we discuss four inferences which form part of Kane's validity theory: from observations to scores; from scores to universe scores; from universe scores to target domain; and from target domain to construct. For each of these inferences, we provide examples and descriptions of approaches and arguments that may help to support the validity inference.
As well as standard psychometric methods, a programme of assessment makes use of various other arguments, such as: item review and quality control, structuring and examiner training; probabilistic methods, saturation approaches and judgement processes; and epidemiological methods, collation, triangulation and member-checking procedures. In an assessment programme each of these can be used.