External Validation of SpineNet, an Open-Source Deep Learning Model for Grading Lumbar Disk Degeneration MRI Features, Using the Northern Finland Birth Cohort 1966.

Affiliations

Research Unit of Health Sciences and Technology, University of Oulu.

Finnish Institute of Occupational Health.

Publication information

Spine (Phila Pa 1976). 2023 Apr 1;48(7):484-491. doi: 10.1097/BRS.0000000000004572. Epub 2022 Dec 30.

DOI:10.1097/BRS.0000000000004572

PMID:36728678

Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC9990601/

Abstract

STUDY DESIGN

This is a retrospective observational study to externally validate a deep learning image classification model.

OBJECTIVE

Deep learning models such as SpineNet offer the possibility of automating the process of disk degeneration (DD) classification from magnetic resonance imaging (MRI). External validation is an essential step to their development. The aim of this study was to externally validate SpineNet predictions for DD using Pfirrmann classification and Modic changes (MCs) on data from the Northern Finland Birth Cohort 1966 (NFBC1966).

SUMMARY OF DATA

We validated SpineNet using data from 1331 NFBC1966 participants for whom both lumbar spine MRI data and consensus DD gradings were available.

MATERIALS AND METHODS

SpineNet returned Pfirrmann grade and MC presence from T2-weighted sagittal lumbar MRI sequences from NFBC1966, a data set geographically and temporally separated from its training data set. A range of agreement and reliability metrics were used to compare predictions with expert radiologists. Subsets of data that match SpineNet training data more closely were also tested.

RESULTS

Balanced accuracy for DD was 78% (77%-79%) and for MC 86% (85%-86%). Interrater reliability for Pfirrmann grading was Lin concordance correlation coefficient=0.86 (0.85-0.87) and Cohen κ=0.68 (0.67-0.69). In a low back pain subset, these reliability metrics remained largely unchanged. In total, 20.83% of disks were rated differently by SpineNet compared with the human raters, but only 0.85% of disks had a grade difference >1. Interrater reliability for MC detection was κ=0.74 (0.72-0.75). In the low back pain subset, this metric was almost unchanged at κ=0.76 (0.73-0.79).

CONCLUSIONS

In this study, SpineNet has been benchmarked against expert human raters in the research setting. It has matched human reliability and demonstrates robust performance despite the multiple challenges facing model generalizability.
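
The agreement metrics named in RESULTS (balanced accuracy, Cohen κ, and the Lin concordance correlation coefficient) can be illustrated with a minimal Python sketch. This is not the authors' analysis code. The grade lists are hypothetical toy values, the function names are illustrative, and the κ shown is the unweighted form, which is an assumption because the abstract does not say whether a weighted κ was used.

```python
from collections import Counter
from statistics import fmean

def balanced_accuracy(truth, pred):
    """Mean of per-class recall, so rare grades count as much as common ones."""
    recalls = []
    for grade in sorted(set(truth)):
        idx = [i for i, t in enumerate(truth) if t == grade]
        recalls.append(sum(pred[i] == grade for i in idx) / len(idx))
    return fmean(recalls)

def cohen_kappa(a, b):
    """Unweighted Cohen kappa: observed agreement corrected for chance agreement."""
    n = len(a)
    p_obs = sum(x == y for x, y in zip(a, b)) / n
    count_a, count_b = Counter(a), Counter(b)
    p_exp = sum(count_a[k] * count_b[k] for k in count_a) / (n * n)
    return (p_obs - p_exp) / (1 - p_exp)

def lin_ccc(a, b):
    """Lin concordance correlation coefficient using population (1/n) moments."""
    mean_a, mean_b = fmean(a), fmean(b)
    var_a = fmean([(x - mean_a) ** 2 for x in a])
    var_b = fmean([(y - mean_b) ** 2 for y in b])
    cov = fmean([(x - mean_a) * (y - mean_b) for x, y in zip(a, b)])
    return 2 * cov / (var_a + var_b + (mean_a - mean_b) ** 2)

# Hypothetical Pfirrmann grades (1-5) for ten disks; NOT data from the study.
human = [1, 2, 2, 3, 3, 3, 4, 4, 5, 2]
model = [1, 2, 3, 3, 3, 4, 4, 4, 5, 2]

n = len(human)
frac_diff = sum(h != m for h, m in zip(human, model)) / n        # any disagreement
frac_gt1 = sum(abs(h - m) > 1 for h, m in zip(human, model)) / n  # off by more than one grade

print("balanced accuracy:", round(balanced_accuracy(human, model), 3))
print("Cohen kappa:", round(cohen_kappa(human, model), 3))
print("Lin CCC:", round(lin_ccc(human, model), 3))
print("rated differently:", frac_diff, "| grade difference > 1:", frac_gt1)
```

The last two quantities mirror the disagreement figures in RESULTS: with 20.83% of disks rated differently and 0.85% differing by more than one grade, 20.83% − 0.85% = 19.98% of disks differed by exactly one Pfirrmann grade.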
