Deep-learning-assisted diagnosis for knee magnetic resonance imaging: Development and retrospective validation of MRNet.

Author affiliations

Department of Computer Science, Stanford University, Stanford, California, United States of America.

Quantitative Sciences Unit, Department of Medicine, Stanford University, Stanford, California, United States of America.

Publication information

PLoS Med. 2018 Nov 27;15(11):e1002699. doi: 10.1371/journal.pmed.1002699. eCollection 2018 Nov.

DOI:10.1371/journal.pmed.1002699

PMID:30481176

Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC6258509/

Abstract

BACKGROUND

Magnetic resonance imaging (MRI) of the knee is the preferred method for diagnosing knee injuries. However, interpretation of knee MRI is time-intensive and subject to diagnostic error and variability. An automated system for interpreting knee MRI could prioritize high-risk patients and assist clinicians in making diagnoses. Deep learning methods, in being able to automatically learn layers of features, are well suited for modeling the complex relationships between medical images and their interpretations. In this study we developed a deep learning model for detecting general abnormalities and specific diagnoses (anterior cruciate ligament [ACL] tears and meniscal tears) on knee MRI exams. We then measured the effect of providing the model's predictions to clinical experts during interpretation.

METHODS AND FINDINGS

Our dataset consisted of 1,370 knee MRI exams performed at Stanford University Medical Center between January 1, 2001, and December 31, 2012 (mean age 38.0 years; 569 [41.5%] female patients). The majority vote of 3 musculoskeletal radiologists established reference standard labels on an internal validation set of 120 exams. We developed MRNet, a convolutional neural network for classifying MRI series and combined predictions from 3 series per exam using logistic regression. In detecting abnormalities, ACL tears, and meniscal tears, this model achieved area under the receiver operating characteristic curve (AUC) values of 0.937 (95% CI 0.895, 0.980), 0.965 (95% CI 0.938, 0.993), and 0.847 (95% CI 0.780, 0.914), respectively, on the internal validation set. We also obtained a public dataset of 917 exams with sagittal T1-weighted series and labels for ACL injury from Clinical Hospital Centre Rijeka, Croatia. On the external validation set of 183 exams, the MRNet trained on Stanford sagittal T2-weighted series achieved an AUC of 0.824 (95% CI 0.757, 0.892) in the detection of ACL injuries with no additional training, while an MRNet trained on the rest of the external data achieved an AUC of 0.911 (95% CI 0.864, 0.958). We additionally measured the specificity, sensitivity, and accuracy of 9 clinical experts (7 board-certified general radiologists and 2 orthopedic surgeons) on the internal validation set both with and without model assistance. Using a 2-sided Pearson's chi-squared test with adjustment for multiple comparisons, we found no significant differences between the performance of the model and that of unassisted general radiologists in detecting abnormalities. General radiologists achieved significantly higher sensitivity in detecting ACL tears (p-value = 0.002; q-value = 0.019) and significantly higher specificity in detecting meniscal tears (p-value = 0.003; q-value = 0.019). Using a 1-tailed t test on the change in performance metrics, we found that providing model predictions significantly increased clinical experts' specificity in identifying ACL tears (p-value < 0.001; q-value = 0.006). The primary limitations of our study include lack of surgical ground truth and the small size of the panel of clinical experts.
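The step of combining per-series predictions with logistic regression and scoring the result by AUC can be illustrated with a minimal sketch. This is not the authors' code: the data below are synthetic, the helper names (`auc`, `fit_logistic`, `predict`) are hypothetical, and plain batch gradient descent stands in for whatever fitting procedure the study actually used. Each exam is assumed to arrive with three per-series probabilities, one from each series-level classifier.

```python
import math
import random


def auc(labels, scores):
    """AUC as the chance that a random positive outscores a random negative (ties count 0.5)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def fit_logistic(X, y, lr=0.5, epochs=500):
    """Fit weights and bias by batch gradient descent on the mean log loss."""
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * d, 0.0
        for xi, yi in zip(X, y):
            err = sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b) - yi
            for j in range(d):
                grad_w[j] += err * xi[j]
            grad_b += err
        w = [wj - lr * g / n for wj, g in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b


def predict(X, w, b):
    """Return one combined exam-level probability per row of per-series probabilities."""
    return [sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b) for xi in X]


# Synthetic stand-in data (an assumption, not study data): 400 exams with
# balanced labels and three noisy per-series probabilities per exam.
rng = random.Random(0)
y = [i % 2 for i in range(400)]
X = [
    [min(1.0, max(0.0, 0.5 + 0.25 * (2 * yi - 1) + rng.gauss(0, 0.2))) for _ in range(3)]
    for yi in y
]

# Fit the combiner on the first 300 exams and evaluate on the held-out 100.
w, b = fit_logistic(X[:300], y[:300])
combined = predict(X[300:], w, b)
combined_auc = auc(y[300:], combined)
single_auc = auc(y[300:], [row[0] for row in X[300:]])
print(f"single-series AUC: {single_auc:.3f}, combined AUC: {combined_auc:.3f}")
```

The learned weights indicate how much each series contributes to the exam-level prediction. In practice a library implementation such as scikit-learn's `LogisticRegression` and `roc_auc_score` would normally replace these hand-written helpers.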

CONCLUSIONS

Our deep learning model can rapidly generate accurate clinical pathology classifications of knee MRI exams from both internal and external datasets. Moreover, our results support the assertion that deep learning models can improve the performance of clinical experts during medical imaging interpretation. Further research is needed to validate the model prospectively and to determine its utility in the clinical setting.
