The Sax Institute, Level 13, Building 10, 235 Jones Street, Ultimo, New South Wales, 2007, Australia.
National Centre for Epidemiology and Population Health (NCEPH), Research School of Population Health, The Australian National University, 62 Mills Road, Acton, Australian Capital Territory, 0200, Australia.
Implement Sci. 2017 Dec 19;12(1):149. doi: 10.1186/s13012-017-0676-7.
Few measures of research use in health policymaking are available, and the reliability of such measures has yet to be evaluated. A new measure called the Staff Assessment of Engagement with Evidence (SAGE) incorporates an interview that explores policymakers' research use within discrete policy documents and a scoring tool that quantifies the extent of policymakers' research use based on the interview transcript and analysis of the policy document itself. We aimed to conduct a preliminary investigation of the usability, sensitivity, and reliability of the scoring tool in measuring research use by policymakers.
Nine experts in health policy research and two independent coders were recruited. Each expert used the scoring tool to rate a random selection of 20 interview transcripts, and each independent coder rated 60 transcripts. The distribution of scores among experts was examined, and interrater reliability was then tested within and between the experts and the independent coders. Average- and single-measure reliability coefficients were computed for each SAGE subscale.
Experts' scores ranged from the limited to the extensive scoring bracket for all subscales. Experts as a group also exhibited at least a fair level of interrater agreement across all subscales. Single-measure reliability was at least fair except for three subscales: Relevance Appraisal, Conceptual Use, and Instrumental Use. Average- and single-measure reliability among independent coders was good to excellent for all subscales. Finally, reliability between experts and independent coders was fair to excellent for all subscales.
Among experts, the scoring tool was comprehensible, usable, and sufficiently sensitive to discriminate between documents with varying degrees of research use. The scoring tool also yielded scores with good reliability among the independent coders. There was greater variability among individual experts, although reliability for the expert group as a whole was fair. The alignment between experts' and independent coders' ratings indicates that the independent coders were scoring in a manner comparable to health policy research experts. If the present findings are replicated in a larger sample, end users (e.g. policy agency staff) could potentially be trained to use SAGE to reliably score research use within their agencies, which would provide a cost-effective and time-efficient approach to utilising this measure in practice.