Institute of Health and Society, Newcastle University, 21 Claremont Place, Newcastle upon Tyne, NE2 4AA, UK.
Implement Sci. 2010 Feb 26;5:20. doi: 10.1186/1748-5908-5-20.
Studies included in a related systematic review used a variety of statistical methods to summarise clinical behaviour and to compare proxy (or indirect) and direct (observed) methods of measuring it. The objective of the present review was to assess the validity of these statistical methods and make appropriate recommendations.
Electronic bibliographic databases were searched to identify studies meeting specified inclusion criteria. Two reviewers independently screened potentially relevant studies for inclusion. The statistical methods were then systematically abstracted, categorised, and critically assessed.
Fifteen reports (of 11 studies) met the inclusion criteria. Thirteen analysed individual clinical actions separately and presented a variety of summary statistics: sensitivity was available in eight reports and specificity in six, but four reports treated different actions interchangeably. Seven reports combined several actions into summary measures of behaviour: five reports compared means on direct and proxy measures using analysis of variance or t-tests; four reported the Pearson correlation; none compared direct and proxy measures over the range of their values. Four reports comparing individual items used appropriate statistical methods, but reports that compared summary scores did not.
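As an illustration of the item-level statistics named above, the following minimal sketch computes sensitivity, specificity, and positive predictive value for a single clinical action. It assumes that the direct (observed) measure is treated as the reference standard and that each consultation yields one yes/no record per method. The function name `agreement_stats` and the example data are hypothetical and are not taken from any included study.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass
class AgreementStats:
    sensitivity: float  # P(proxy records action | action directly observed)
    specificity: float  # P(proxy records no action | action not observed)
    ppv: float          # P(action directly observed | proxy records action)


def agreement_stats(direct: Sequence[bool], proxy: Sequence[bool]) -> AgreementStats:
    """Compare a proxy measure with a direct measure of one clinical action.

    Assumption: `direct` is the reference standard; both sequences hold one
    boolean per consultation, in the same order.
    """
    if len(direct) != len(proxy):
        raise ValueError("direct and proxy must have the same length")

    # Cells of the 2x2 table.
    tp = sum(d and p for d, p in zip(direct, proxy))
    fn = sum(d and not p for d, p in zip(direct, proxy))
    fp = sum(p and not d for d, p in zip(direct, proxy))
    tn = sum(not d and not p for d, p in zip(direct, proxy))

    def ratio(num: int, den: int) -> float:
        # Undefined when the denominator is empty; report NaN rather than fail.
        return num / den if den else float("nan")

    return AgreementStats(
        sensitivity=ratio(tp, tp + fn),
        specificity=ratio(tn, tn + fp),
        ppv=ratio(tp, tp + fp),
    )


# Hypothetical data: ten consultations, one action.
direct = [True, True, True, True, False, False, False, False, False, False]
proxy = [True, True, True, False, True, False, False, False, False, False]
stats = agreement_stats(direct, proxy)
print(stats)
```

Each action would be analysed separately in this way; pooling counts across different actions would treat those actions as interchangeable.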
We recommend sensitivity and positive predictive value as statistics to assess agreement of direct and proxy measures of individual clinical actions. Summary measures should be reliable, repeatable, capture a single underlying aspect of behaviour, and map that construct onto a valid measurement scale. The relationship between the direct and proxy measures should be evaluated over the entire range of the direct measure, and the evaluation should describe not only the mean of the proxy measure for any specific value of the direct measure but also the range of variability of the proxy measure. The evidence about the relationship between direct and proxy methods of assessing clinical behaviour is weak.
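One simple way to examine summary scores over the whole range of the direct measure is to tabulate, for each observed value of the direct score, the mean and spread of the corresponding proxy scores. The sketch below is illustrative only: it assumes paired integer summary scores, uses the minimum and maximum as a crude description of variability, and the function name `proxy_by_direct` and the data are hypothetical.

```python
from collections import defaultdict
from statistics import mean
from typing import Dict, Iterable, List, Tuple


def proxy_by_direct(pairs: Iterable[Tuple[int, float]]) -> Dict[int, Dict[str, float]]:
    """Summarise proxy scores at each value of the direct score.

    `pairs` holds (direct_score, proxy_score) for each clinician or
    consultation. Returns, per direct value, the number of pairs and the
    mean, minimum, and maximum of the proxy scores.
    """
    groups: Dict[int, List[float]] = defaultdict(list)
    for direct_score, proxy_score in pairs:
        groups[direct_score].append(proxy_score)

    return {
        direct_score: {
            "n": len(values),
            "mean": mean(values),
            "min": min(values),
            "max": max(values),
        }
        for direct_score, values in sorted(groups.items())
    }


# Hypothetical paired summary scores (direct, proxy).
pairs = [(0, 1), (0, 2), (1, 1), (1, 3), (2, 2), (2, 2), (2, 5)]
summary = proxy_by_direct(pairs)
for direct_score, row in summary.items():
    print(direct_score, row)
```

Unlike a single correlation coefficient or a comparison of overall means, a table of this kind shows whether the proxy tracks the direct measure at low, middle, and high values, and how widely the proxy scatters at each.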