Health and Community Care Research Unit, University of Liverpool, Liverpool, UK.
Health Technol Assess. 2013 Oct;17(50):i-xiv, 1-128. doi: 10.3310/hta17500.
This review systematically examines the research literature published in the period 2002-8 on structured violence risk assessment instruments designed for use in mental health services or the criminal justice system. It adopted much broader inclusion criteria than previous reviews in the same area in order to capture and summarise data on the widest possible range of available instruments.
The review aimed to address two questions: (1) what study characteristics are associated with a risk assessment instrument score being significantly associated with a violent outcome? and (2) which risk assessment instruments have the highest level of predictive validity for a violent outcome?
Nineteen bibliographic databases were searched from January 2002 to April 2008, including PsycINFO, MEDLINE, Cumulative Index to Nursing and Allied Health Literature, Allied and Complementary Medicine Database, British Nursing Index, International Bibliography of the Social Sciences, Education Resources Information Centre, The Cochrane Library and Web of Knowledge.
Inclusion criteria for studies were (1) evaluation of a structured risk tool; (2) outcome measure of interpersonal violence; (3) participants aged 17 years or over; and (4) participants with a mental disorder and/or at least one offence and/or at least one indictable offence. A series of bivariate analyses using either a chi-squared test or Spearman's rank-order correlation were conducted to explore associations between study characteristics and outcomes. Data from a subset of studies reporting area under the curve (AUC) analysis were combined to provide estimates of mean validity.
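As a rough illustration of the AUC statistic referred to above, the sketch below computes it for a single study as the probability that a randomly chosen participant with a violent outcome scores higher on the instrument than a randomly chosen participant without one. The function name and the scores are hypothetical and are not taken from the review.

```python
def auc(scores_violent, scores_nonviolent):
    """AUC as the share of (violent, non-violent) pairs in which the
    violent case has the higher instrument score; ties count as 0.5."""
    wins = 0.0
    for v in scores_violent:
        for n in scores_nonviolent:
            if v > n:
                wins += 1.0
            elif v == n:
                wins += 0.5
    return wins / (len(scores_violent) * len(scores_nonviolent))


# Hypothetical instrument scores, for illustration only.
violent = [22, 30, 27, 18]
nonviolent = [12, 20, 18, 9, 25]

study_auc = auc(violent, nonviolent)
print(f"AUC = {study_auc:.3f}")  # AUC = 0.825
```

An AUC of 0.5 corresponds to chance-level discrimination and 1.0 to perfect separation of the two groups.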
For the overall set of included studies (n = 959), over three-quarters (77%) were conducted in the USA, Canada or the UK. Two-thirds of all studies were conducted with either offenders who had no formal mental health diagnosis (43%) or forensic samples with a formal diagnosis (25%). The Psychopathy Checklist-Revised was tested in the largest number of studies (n = 192). Most studies (78%) reported a statistically significant (p < 0.05) relationship between the instrument score and a violent outcome. Prospective data collection (chi-squared = 4.4, p = 0.035), number of people recruited (U = 27.8, p = 0.012) and number of participants at end point (U = 26.9, p = 0.04) were significantly associated with predictive validity. For those instruments tested in five or more studies reporting AUC values, the General Statistical Information on Recidivism instrument had the highest mean AUC (0.73).
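To make the chi-squared result above more concrete, the following sketch shows how such a statistic could be computed for a 2 x 2 table (for example, prospective versus other designs against significant versus non-significant findings). The counts are hypothetical, and the use of a 2 x 2 table with one degree of freedom and no continuity correction is an assumption, since the abstract does not report these details.

```python
import math


def chi_squared_2x2(a, b, c, d):
    """Pearson chi-squared statistic (no continuity correction) for the
    2 x 2 table [[a, b], [c, d]]."""
    n = a + b + c + d
    return n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))


def p_value_df1(chi2):
    """Upper-tail p-value of a chi-squared statistic with 1 degree of freedom."""
    return math.erfc(math.sqrt(chi2 / 2))


# Hypothetical counts, for illustration only:
# rows = prospective / other design, columns = significant / not significant.
chi2 = chi_squared_2x2(45, 5, 35, 15)
p = p_value_df1(chi2)
print(f"chi-squared = {chi2:.2f}, p = {p:.4f}")

# Under the same 1-degree-of-freedom assumption, a statistic of 4.4
# gives a p-value close to the reported 0.035.
print(f"p for chi-squared 4.4: {p_value_df1(4.4):.3f}")
```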
Agreement between pairs of reviewers in the initial pilot exercises was good but less than perfect, so discrepancies may be present given the complexity and subjectivity of some aspects of violence research. Only five of the seven calendar years (2003-7) are completely covered, with partial coverage of 2002 and 2008. There is no weighting for sample or effect sizes when results from studies are aggregated.
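The last limitation can be illustrated with a small sketch comparing an unweighted mean AUC with a sample-size-weighted mean. The AUC values and sample sizes are hypothetical, and weighting by sample size is only one of several possible schemes; it is not a method used in the review.

```python
def unweighted_mean(aucs):
    """Simple average: every study counts equally, regardless of size."""
    return sum(aucs) / len(aucs)


def sample_weighted_mean(aucs, sample_sizes):
    """Average in which each study's AUC is weighted by its sample size."""
    total = sum(sample_sizes)
    return sum(a * n for a, n in zip(aucs, sample_sizes)) / total


# Hypothetical studies, for illustration only: one small study with a
# high AUC and one large study with a lower AUC.
aucs = [0.80, 0.60]
sample_sizes = [50, 950]

plain = unweighted_mean(aucs)
weighted = sample_weighted_mean(aucs, sample_sizes)
print(f"unweighted = {plain:.2f}, weighted = {weighted:.2f}")
```

In this example the unweighted mean sits midway between the two studies, while the weighted mean is pulled towards the larger study, which is why unweighted aggregates should be read with caution.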
A very large number of studies examining the relationship between a structured instrument and a violent outcome were published in this relatively short 7-year period. The general quality of the literature is weak in places (e.g. over-reliance on cross-sectional designs) and a vast range of distinct instruments has been tested to varying degrees. However, there is evidence of some convergence around a small number of high-performing instruments and identification of the components of a high-quality evaluation approach, including AUC analysis. The upper limits (AUC ≥ 0.85) of instrument-based prediction have probably been achieved and are unlikely to be exceeded using instruments alone.
Funded by the National Institute for Health Research Health Technology Assessment and Research for Patient Benefit programmes.