Department of Mental Health Law and Policy, University of South Florida, 13301 Bruce B. Downs Blvd., Tampa, FL 33612, USA.
Behav Sci Law. 2013 Jan-Feb;31(1):55-73. doi: 10.1002/bsl.2053.
The objective of the present review was to examine how predictive validity is analyzed and reported in studies of instruments used to assess violence risk. We reviewed 47 predictive validity studies published between 1990 and 2011 of 25 instruments that were included in two recent systematic reviews. Although all studies reported receiver operating characteristic curve analyses and the area under the curve (AUC) performance indicator, this methodology was defined inconsistently and findings were often misinterpreted. In addition, the benchmarks used to determine whether AUCs were small, moderate, or large in magnitude varied between studies. Though virtually all of the included instruments were designed to produce categorical estimates of risk (through either actuarial risk bins or structured professional judgments), only a minority of studies calculated performance indicators for these categorical estimates. Performance indicators other than the AUC, such as correlation coefficients, were reported in 60% of studies but were infrequently defined or interpreted. An investigation of sources of heterogeneity did not reveal significant variation in reporting practices as a function of risk assessment approach (actuarial vs. structured professional judgment), study authorship, geographic location, type of journal (general vs. specialized audience), sample size, or year of publication. Findings suggest a need for standardization of predictive validity reporting to improve comparison across studies and instruments.
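Because the abstract notes that the AUC was defined inconsistently, a small worked example may help fix the standard interpretation: the AUC equals the probability that a randomly selected individual who went on to be violent received a higher risk score than a randomly selected individual who did not, with ties counted as one half. The sketch below is illustrative only. The function name `auc` and the risk scores are hypothetical and are not drawn from the reviewed studies or from any particular instrument.

```python
from itertools import product


def auc(scores_violent, scores_nonviolent):
    """Pairwise (Mann-Whitney) form of the AUC.

    Returns the proportion of (violent, nonviolent) pairs in which the
    violent individual has the higher score; tied pairs count as 0.5.
    """
    pairs = list(product(scores_violent, scores_nonviolent))
    if not pairs:
        raise ValueError("both groups must contain at least one score")
    wins = sum(1.0 if v > n else 0.5 if v == n else 0.0 for v, n in pairs)
    return wins / len(pairs)


# Hypothetical total scores on an unnamed risk instrument (assumed data).
violent = [7, 5, 9, 4]
nonviolent = [3, 5, 2, 6, 1]

print(auc(violent, nonviolent))  # 16.5 favourable pairs out of 20 -> 0.825
```

An AUC of 0.5 corresponds to chance-level discrimination and 1.0 to perfect separation of the two groups. The value says nothing about how many individuals in a given risk category actually went on to be violent, which is one reason indicators computed on the categorical estimates are reported separately.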