Inter-observer variability in the classification of lumbar foraminal stenosis in magnetic resonance imaging using different evaluation scales.

Authors

Sá Silva José, Pereira Ana, Abreu Vasco, Filipe João Pedro

Affiliation

Department of Neuroradiology, Centro Hospitalar Universitário de Santo António, Unidade Local de Saúde de Santo António, Porto, Portugal.

Publication information

Eur Spine J. 2025 Mar;34(3):869-873. doi: 10.1007/s00586-024-08612-z. Epub 2024 Dec 20.

Abstract

BACKGROUND

The evaluation of lumbar spine degeneration on magnetic resonance imaging (MRI) is prone to inter-reader variability, including when assessing foraminal changes. This variability, often due to subjective criteria and inconsistent terminology, may affect clinical correlations. Standardized criteria could help improve agreement among readers.

MATERIALS AND METHODS

Lumbar spine MRI examinations of 50 randomly selected patients were evaluated by 12 independent readers. Foraminal stenosis was assessed in each patient using four different rating scales. The first scale classified stenosis as presence/absence of neurologic compromise of the spinal nerve root at the foramen; the second as absent/mild/moderate/severe; the third as normal/contact of disk or osteophyte with the nerve root/deviation of the nerve root/compression of the nerve root; and the fourth used the Lee et al. criteria. Agreement was analyzed using Fleiss' kappa coefficients.
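For readers unfamiliar with the statistic, the following is a minimal sketch of how Fleiss' kappa can be computed for a fixed number of readers per case. It is not the authors' code; the function name `fleiss_kappa` and the toy ratings are hypothetical and do not come from the study.

```python
from collections import Counter


def fleiss_kappa(ratings, categories):
    """Fleiss' kappa for nominal ratings.

    ratings:    list of cases, each a list with one label per reader
    categories: list of all labels allowed by the scale
    (Hypothetical helper for illustration; not the authors' implementation.)
    """
    n_cases = len(ratings)
    n_readers = len(ratings[0])

    # counts[i][j] = number of readers who put case i in category j
    counts = []
    for row in ratings:
        if len(row) != n_readers:
            raise ValueError("every case needs the same number of ratings")
        tally = Counter(row)
        counts.append([tally.get(cat, 0) for cat in categories])

    # Observed agreement: mean proportion of agreeing reader pairs per case
    per_case = [
        (sum(n * n for n in row) - n_readers) / (n_readers * (n_readers - 1))
        for row in counts
    ]
    p_observed = sum(per_case) / n_cases

    # Chance agreement: sum of squared overall category proportions
    total = n_cases * n_readers
    proportions = [
        sum(row[j] for row in counts) / total for j in range(len(categories))
    ]
    p_chance = sum(p * p for p in proportions)

    if p_chance == 1.0:
        raise ValueError("kappa is undefined when only one category is used")
    return (p_observed - p_chance) / (1.0 - p_chance)


# Toy data (made up): 4 cases, 3 readers, dichotomous scale (1 = compromise)
toy_ratings = [[1, 1, 1], [0, 0, 0], [1, 1, 0], [0, 0, 1]]
print(round(fleiss_kappa(toy_ratings, categories=[0, 1]), 3))  # 0.333
```

In the study's setting each case would carry 12 labels, one per reader, and the category list would change with the scale being evaluated.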

RESULTS

Agreement was moderate using the first scale (k = 0.439), and significantly lower using the second, third and fourth scales (k = 0.310, k = 0.311, k = 0.295, respectively). When agreement among board-certified neuroradiologists was compared with agreement among neuroradiology residents, there were statistically significant differences for the third and fourth scales: agreement among board-certified neuroradiologists was higher, but still only fair. Individual kappas showed that in the second, third, and fourth scales the levels of agreement were higher at the extremes of the scale, namely, when there was no stenosis or when the stenosis was maximal with nerve compression.
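The abstract does not say how the individual kappas were obtained. One common choice, assumed here, is the category-wise kappa from Fleiss' original formulation, which measures agreement on each category against all others combined. The sketch below uses a hypothetical function name and made-up counts for a three-level scale.

```python
def per_category_kappa(counts):
    """Category-wise Fleiss kappa (assumed method; not confirmed by the source).

    counts[i][j] = number of readers who put case i in category j.
    Every row must sum to the same number of readers.
    """
    n_cases = len(counts)
    n_readers = sum(counts[0])
    kappas = []
    for j in range(len(counts[0])):
        column = [row[j] for row in counts]
        p_j = sum(column) / (n_cases * n_readers)
        q_j = 1.0 - p_j
        if p_j == 0.0 or q_j == 0.0:
            # Category never used, or always used: kappa is undefined
            kappas.append(float("nan"))
            continue
        # Disagreeing reader pairs for "category j versus everything else"
        disagreement = sum(n_ij * (n_readers - n_ij) for n_ij in column)
        expected = n_cases * n_readers * (n_readers - 1) * p_j * q_j
        kappas.append(1.0 - disagreement / expected)
    return kappas


# Made-up counts: 6 cases, 3 readers, categories = lowest / middle / highest
toy_counts = [
    [3, 0, 0],
    [3, 0, 0],
    [0, 0, 3],
    [0, 0, 3],
    [1, 2, 0],
    [0, 2, 1],
]
lowest, middle, highest = per_category_kappa(toy_counts)
print(lowest > middle and highest > middle)  # True
```

In this toy example the two end categories score higher than the middle one, which is the kind of pattern the results describe; the numbers themselves say nothing about the study data.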

CONCLUSIONS

Levels of agreement can differ depending on the scale used. Simpler dichotomous scales may yield higher levels of agreement than more complex ones. Among the non-dichotomous scales, the choice of scale may not change the overall level of agreement. Given the low overall inter-rater agreement observed, there is probably significant potential to enhance agreement through more rigorous training and consensus-building.
