Grucza Richard A, Goldberg Lewis R
Department of Psychiatry, Washington University School of Medicine, USA.
J Pers Assess. 2007 Oct;89(2):167-87. doi: 10.1080/00223890701468568.
In science, multiple measures of the same constructs can be useful, but they are unlikely to all be equally valid indicators. In psychological assessment, the many popular personality inventories available in the marketplace also may be useful, but their comparative validity has long remained unassessed. This is the first comprehensive comparison of 11 such multiscale instruments against each of three types of criteria: clusters of behavioral acts, descriptions by knowledgeable informants, and clinical indicators potentially associated with various types of psychopathology. Using 1,000 bootstrap resampling analyses from a sample of roughly 700 adult research participants, we assess the relative predictability of each criterion and the comparative validity of each inventory. Although there was a wide range of criterion predictability, most inventories exhibited quite similar cross-validities when averaged across all three types of criteria. On the other hand, there were important differences between inventories in their predictive capabilities for particular criteria. We discuss the factors that lead to differential validity across predictors and criteria.
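The abstract does not spell out the resampling procedure, so the following is only a minimal sketch of one common way to estimate a cross-validity by bootstrap, not the authors' method. It assumes a single predictor scale and a single criterion (a real multiscale inventory would call for multiple regression), fits a line on each bootstrap resample, and scores the predictions on the cases left out of that resample. The function names `pearson_r`, `fit_line`, and `bootstrap_cross_validity` are hypothetical.

```python
import random
import statistics


def pearson_r(a, b):
    """Pearson correlation between two equal-length sequences."""
    mean_a, mean_b = statistics.fmean(a), statistics.fmean(b)
    cov = sum((u - mean_a) * (v - mean_b) for u, v in zip(a, b))
    var_a = sum((u - mean_a) ** 2 for u in a)
    var_b = sum((v - mean_b) ** 2 for v in b)
    if var_a == 0 or var_b == 0:
        return 0.0  # correlation is undefined for a constant; treat as zero
    return cov / (var_a * var_b) ** 0.5


def fit_line(x, y):
    """Least-squares intercept and slope for y regressed on x."""
    mean_x, mean_y = statistics.fmean(x), statistics.fmean(y)
    sxx = sum((u - mean_x) ** 2 for u in x)
    if sxx == 0:
        return mean_y, 0.0
    slope = sum((u - mean_x) * (v - mean_y) for u, v in zip(x, y)) / sxx
    return mean_y - slope * mean_x, slope


def bootstrap_cross_validity(x, y, n_boot=1000, seed=0):
    """Average held-out correlation over n_boot bootstrap resamples.

    Assumption (not from the source): each resample draws n cases with
    replacement as the training set, and the cases never drawn serve as
    the validation set for that resample.
    """
    rng = random.Random(seed)
    n = len(x)
    estimates = []
    for _ in range(n_boot):
        train = [rng.randrange(n) for _ in range(n)]
        in_train = set(train)
        held_out = [i for i in range(n) if i not in in_train]
        if len(held_out) < 3:
            continue  # too few held-out cases for a meaningful correlation
        intercept, slope = fit_line([x[i] for i in train],
                                    [y[i] for i in train])
        predicted = [intercept + slope * x[i] for i in held_out]
        observed = [y[i] for i in held_out]
        estimates.append(pearson_r(predicted, observed))
    return statistics.fmean(estimates) if estimates else float("nan")
```

Calling `bootstrap_cross_validity(scale_scores, criterion_scores)` on two equal-length lists returns one averaged estimate. Repeating the call for each inventory and criterion would give the kind of comparison grid the abstract describes, under the simplifying assumptions stated above.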