Lewandrowski Kai-Uwe, Muraleedharan Narendran, Eddy Steven Allen, Sobti Vikram, Reece Brian D, Ramírez León Jorge Felipe, Shah Sandeep
Staff Orthopaedic Spine Surgeon, Center for Advanced Spine Care of Southern Arizona and Surgical Institute of Tucson, Tucson, Arizona.
Aptus Engineering, Inc, Scottsdale, Arizona, and Multus Medical, LLC, Phoenix, Arizona.
Int J Spine Surg. 2020 Dec;14(s3):S86-S97. doi: 10.14444/7131.
Artificial intelligence is gaining traction in automated medical imaging analysis. Development of more accurate magnetic resonance imaging (MRI) predictors of successful clinical outcomes is necessary to better define indications for surgery, improve clinical outcomes with targeted minimally invasive and endoscopic procedures, and realize cost savings by avoiding more invasive spine care.
To demonstrate the ability of deep learning neural network models to identify features in MRI DICOM datasets that represent varying intensities or severities of common spinal pathologies and injuries, and to demonstrate the feasibility of generating automated verbal MRI reports comparable to those produced by reading radiologists.
A 3-dimensional (3D) anatomical model of the lumbar spine was fitted to each patient's MRI by a team of technicians. MRI T1, T2, sagittal, axial, and transverse reconstruction image series were used to train segmentation models by intersecting the 3D model with these image sequences. Class definitions for the central canal were extracted from the radiologist report: (0) no disc bulge/protrusion/canal stenosis, (1) disc bulge without canal stenosis, (2) disc bulge resulting in canal stenosis, and (3) disc herniation/protrusion/extrusion resulting in canal stenosis. Both the left and right neural foramina were assessed as either (0) neural foraminal stenosis absent or (1) neural foraminal stenosis present. Reporting criteria for the pathologies at each disc level and, when available, the severity grading were extracted, and a natural language processing model was used to generate a verbal and written report. These data were then used to train a set of very deep convolutional neural network models, optimizing for minimal binary cross-entropy for each classification.
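As a rough illustration of the labeling scheme and loss described above, the following Python sketch turns the central canal classes into binary detector targets and evaluates binary cross-entropy on them. The mapping from the four canal classes to the two binary targets, the function names, and the example probabilities are assumptions made for illustration only; the abstract does not state how the classes were mapped to the individual detectors.

```python
import math

# Central canal classes as listed in the abstract.
CANAL_CLASSES = {
    0: "no disc bulge/protrusion/canal stenosis",
    1: "disc bulge without canal stenosis",
    2: "disc bulge resulting in canal stenosis",
    3: "disc herniation/protrusion/extrusion resulting in canal stenosis",
}


def canal_to_binary_targets(canal_class):
    """Map a canal class (0-3) to binary detector targets.

    Assumed mapping (not given in the abstract): classes 2 and 3 count as
    central stenosis, and class 3 counts as disc herniation.
    """
    if canal_class not in CANAL_CLASSES:
        raise ValueError(f"unknown canal class: {canal_class}")
    return {
        "central_stenosis": 1 if canal_class in (2, 3) else 0,
        "disc_herniation": 1 if canal_class == 3 else 0,
    }


def binary_cross_entropy(targets, probs, eps=1e-7):
    """Mean binary cross-entropy for 0/1 targets and predicted probabilities."""
    if len(targets) != len(probs) or not targets:
        raise ValueError("targets and probs must be non-empty and equal length")
    total = 0.0
    for y, p in zip(targets, probs):
        p = min(max(p, eps), 1.0 - eps)  # clip to avoid log(0)
        total += -(y * math.log(p) + (1 - y) * math.log(1.0 - p))
    return total / len(targets)


if __name__ == "__main__":
    # One example disc level per canal class.
    labels = [canal_to_binary_targets(c)["central_stenosis"] for c in (0, 1, 2, 3)]
    predicted = [0.1, 0.2, 0.7, 0.9]  # made-up model outputs, not study data
    print(labels)
    print(round(binary_cross_entropy(labels, predicted), 4))
```

A lower loss means the predicted probabilities sit closer to the radiologist-derived labels; training one such loss per detector matches the "for each classification" wording, under the mapping assumed here.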
The initial prediction validation of the implemented deep learning algorithm was performed on a held-out 20% of the dataset that was not used for training. Of the 17,800 total disc locations for which MRI images and radiology reports were available, 14,720 were used to train the model, and 3560 were used for validation. Validation accuracy for the foraminal stenosis detector converged to 81% (sensitivity = 72.4%, specificity = 83.1%) after 25 complete iterations through the entire training dataset (epochs). The accuracy was 86.2% (sensitivity = 91.1%, specificity = 82.5%) for the central stenosis detector and 85.2% (sensitivity = 81.8%, specificity = 87.4%) for the disc herniation detector.
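For readers less familiar with the reported metrics, the sketch below shows how accuracy, sensitivity, and specificity follow from the confusion-matrix counts of a binary detector. The function name and the counts are hypothetical and chosen only to make the arithmetic easy to follow; they are not the study's data.

```python
def classification_metrics(tp, fp, tn, fn):
    """Return accuracy, sensitivity, and specificity from confusion-matrix counts.

    tp/fn: positive cases (pathology present) detected / missed.
    tn/fp: negative cases (pathology absent) correctly cleared / falsely flagged.
    """
    positives = tp + fn
    negatives = tn + fp
    if positives == 0 or negatives == 0:
        raise ValueError("need at least one positive and one negative case")
    return {
        "accuracy": (tp + tn) / (positives + negatives),
        "sensitivity": tp / positives,   # true positive rate
        "specificity": tn / negatives,   # true negative rate
    }


if __name__ == "__main__":
    # Hypothetical validation counts for one detector (not from the study).
    metrics = classification_metrics(tp=80, fp=30, tn=170, fn=20)
    for name, value in metrics.items():
        print(f"{name}: {value:.1%}")
```

Because accuracy is a weighted average of sensitivity and specificity, weighted by the share of positive and negative cases, it always falls between the two, which is the pattern seen in each of the three reported detectors.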
Deep learning algorithms may be used for routine reporting in spine MRI. There was minimal disparity among accuracy, sensitivity, and specificity, indicating that the models were not overfitted to the training set. We concluded that variability in the training data tends to reduce overfitting and overtraining as the deep neural network models learn to focus on the common pathologies. Future studies should demonstrate the accuracy of deep neural network models and their value in predicting favorable clinical outcomes with intervention and surgery.
Feasibility, clinical teaching, and evaluation study.
Level 3.