Scoring inflammatory activity of the spine by magnetic resonance imaging in ankylosing spondylitis: a multireader experiment.

Authors

Lukas Cédric, Braun Jürgen, van der Heijde Désirée, Hermann Kay-Geert A, Rudwaleit Martin, Østergaard Mikkel, Oostveen Ans, O'Connor Phil, Maksymowych Walter P, Lambert Robert G W, Jurik Anne Grethe, Baraliakos Xenofon, Landewé Robert

Affiliation

Department of Internal Medicine, Division of Rheumatology, University Hospital Maastricht, the Netherlands.

Publication

J Rheumatol. 2007 Apr;34(4):862-70.

Abstract

OBJECTIVE

Magnetic resonance imaging (MRI) of the spine is increasingly important for assessing inflammatory activity in clinical trials of patients with ankylosing spondylitis (AS). We investigated the feasibility, inter-reader reliability, sensitivity to change, and discriminatory ability of 3 different methods for scoring MRI activity, and change in activity, of the spine in patients with AS.

METHODS

Thirty sets of spinal MRI at baseline and after 24 weeks of follow-up, derived from a randomized clinical trial comparing a tumor necrosis factor (TNF)-blocking drug (n = 20) with placebo (n = 10) and selected to cover a wide range of activity at baseline and change in activity, were presented electronically in a partial Latin-square design to 9 experienced readers from different countries (Europe, Canada). Readers scored each set of MRI 3 times, once with each of 3 methods: the Ankylosing Spondylitis spine Magnetic Resonance Imaging-activity score [ASspiMRI-a, grading activity (0-6) per vertebral unit in 23 units]; the Berlin modification of the ASspiMRI-a; and the Spondyloarthritis Research Consortium of Canada (SPARCC) scoring system, which scores the 6 vertebral units considered by the reader to be the most abnormal, with additional scores for "depth" and "intensity." Both the order of the methods used by each reader and the timepoints (before/after treatment) were randomized. Feasibility of each scoring system was evaluated by measuring the mean time needed to score each set of MRI. Inter-reader reliability was evaluated by smallest detectable change (SDC) and by intraclass correlation coefficients (ICC), calculated for all readers together and for all possible reader pairs separately. Sensitivity to change was investigated by calculating Guyatt's effect size on change scores. Discriminatory ability was assessed using Z-scores (Mann-Whitney test) comparing change in score between patients treated with the TNF-blocking drug and those given placebo.
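For readers unfamiliar with the last two statistics, the following is a minimal sketch of how they are commonly computed; it is not the authors' analysis code. It assumes the usual definition of Guyatt's effect size (mean change in the treated group divided by the standard deviation of change in the placebo group) and the plain normal approximation to the Mann-Whitney U statistic, without tie or continuity correction. The abstract does not state which variants were used, and the function names are hypothetical.

```python
from math import sqrt
from statistics import mean, stdev


def guyatt_effect_size(treated_changes, placebo_changes):
    """Mean change under treatment divided by the SD of change under placebo.

    Assumption: this is the common textbook form of Guyatt's statistic;
    the abstract does not spell out the exact formula.
    """
    return mean(treated_changes) / stdev(placebo_changes)


def average_ranks(values):
    """Rank values from 1 to n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # Extend the run while the next sorted value equals the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        shared = (start + end) / 2 + 1  # mean of the 1-based positions in the run
        for pos in range(start, end + 1):
            ranks[order[pos]] = shared
        start = end + 1
    return ranks


def mann_whitney_z(group_a, group_b):
    """Z-score of the Mann-Whitney U for group_a versus group_b.

    Uses the normal approximation with no tie or continuity correction
    (an assumption made to keep the sketch short).
    """
    n_a, n_b = len(group_a), len(group_b)
    ranks = average_ranks(list(group_a) + list(group_b))
    u_a = sum(ranks[:n_a]) - n_a * (n_a + 1) / 2
    mean_u = n_a * n_b / 2
    sd_u = sqrt(n_a * n_b * (n_a + n_b + 1) / 12)
    return (u_a - mean_u) / sd_u


if __name__ == "__main__":
    # Made-up change scores for illustration only; not data from the study.
    treated = [-8, -6, -7, -5]
    placebo = [-1, 0, 1, 2]
    print("Guyatt effect size:", guyatt_effect_size(treated, placebo))
    print("Mann-Whitney Z:", mann_whitney_z(treated, placebo))
```

In a multireader setting such as this one, both functions would be applied once per reader and per scoring method, which would make the reader-to-reader spread in sensitivity to change directly visible.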

RESULTS

The mean time to score one set of MRI was shortest for the Berlin method. SDC was lowest for the Berlin method and highest for SPARCC. Overall inter-reader ICC per method were between 0.49 and 0.77 for scoring activity status, and between 0.46 and 0.72 for scoring activity change. ICC for all possible reader pairs showed much more fluctuation per method, with lowest observed values of about 0.05 (very low agreement) and highest observed values over 0.90 (excellent agreement). In general, ICC for SPARCC were consistently higher than for other systems. Sensitivity to change differed per reader, and was more consistent with SPARCC than with the other methods, but was in general excellent for all 3 methods. Discrimination between groups (TNF-blocker vs placebo) assessed by Z-scores was good and comparable among methods.

CONCLUSION

This experiment demonstrates the feasibility of multiple-reader MRI scoring exercises for method comparison, and provides evidence for the feasibility, reliability, sensitivity to change, and discriminatory capacity of all 3 tested scoring systems for assessing spinal activity on MRI in clinical trials of patients with AS. On the basis of these results, it is not possible to prioritize one of the 3 methods.
