Mayfield John D, Murtagh Ryan, Ciotti John, Robertson Derrick, Naqa Issam El
USF Health Department of Radiology, 2 Tampa General Circle, STC 6103, Tampa, FL, 33612, USA.
Department of Neurology, University of South Florida, Morsani College of Medicine, USF Multiple Sclerosis Center, 13330 USF Laurel Drive, Tampa, FL, 33612, USA.
J Imaging Inform Med. 2024 Dec;37(6):3231-3249. doi: 10.1007/s10278-024-01031-y. Epub 2024 Jun 13.
The majority of deep learning models in medical image analysis concentrate on single-timepoint snapshots, such as the identification of current pathology on a given image or volume. This often contrasts with the diagnostic methodology in radiology, where presumed pathologic findings are correlated with prior studies and subsequent changes over time. For multiple sclerosis (MS), the current body of literature describes various forms of lesion segmentation, with few studies analyzing disability progression over time. For the purpose of longitudinal time-dependent analysis, we propose a combinatorial analysis of a video vision transformer (ViViT) benchmarked against traditional recurrent architectures, namely Convolutional Neural Network-Long Short-Term Memory (CNN-LSTM) networks, and a hybrid Vision Transformer-LSTM (ViT-LSTM) to predict long-term disability based upon the Expanded Disability Status Scale (EDSS). The patient cohort was procured from a two-site institution and comprised multisequence, contrast-enhanced MRIs of the cervical spine from 703 patients, acquired between 2002 and 2023. Following a competitive performance analysis, a VGG-16-based CNN-LSTM (VGG16-LSTM) was compared to ViViT, with an ablation analysis to determine the time-dependency of the models. The VGG16-LSTM predicted the trinary classification of EDSS score in 6 years with 0.74 AUC versus 0.84 AUC for ViViT (p < 0.001 by 5 × 2 cross-validation F-test) on an 80:20 hold-out testing split. However, the VGG16-LSTM outperformed ViViT when the analysis was restricted to patients with only 2 years of MRIs (n = 94) (0.75 AUC versus 0.72 AUC, respectively). Exact EDSS prediction was investigated for both models using both classification and regression strategies, but performance was worse overall. Our experimental results demonstrate the ability of time-dependent deep learning models to predict disability in MS using trinary stratification of disability, mimicking clinical practice. Further work includes external validation and subsequent observational clinical trials.
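For readers unfamiliar with the 5 × 2 cross-validation F-test cited above, the following is a minimal sketch of the combined F statistic as it is commonly defined (Alpaydin's variant), not the authors' code. It assumes the inputs are per-fold differences in a performance metric between two models across 5 repetitions of 2-fold cross-validation; the function name `combined_5x2cv_f_statistic` and the example numbers are hypothetical, and the abstract does not state which metric was differenced.

```python
from typing import Sequence, Tuple


def combined_5x2cv_f_statistic(diffs: Sequence[Tuple[float, float]]) -> float:
    """Combined 5x2cv F statistic from per-fold metric differences.

    `diffs` holds 5 pairs (one pair per repetition); each pair contains the
    difference in a performance metric between model A and model B on the
    two folds of that repetition. (Hypothetical helper for illustration.)
    """
    if len(diffs) != 5 or any(len(pair) != 2 for pair in diffs):
        raise ValueError("expected 5 repetitions of 2 folds each")

    numerator = 0.0      # sum of squared differences over all 10 folds
    variance_sum = 0.0   # sum of the per-repetition variance estimates
    for d1, d2 in diffs:
        mean = (d1 + d2) / 2.0
        variance_sum += (d1 - mean) ** 2 + (d2 - mean) ** 2
        numerator += d1 ** 2 + d2 ** 2

    if variance_sum == 0.0:
        raise ValueError("zero variance across folds; statistic is undefined")

    # Under the null hypothesis this is approximately F-distributed
    # with 10 and 5 degrees of freedom.
    return numerator / (2.0 * variance_sum)


# Made-up differences, used only to show the call.
example_diffs = [(0.10, 0.20)] * 5
f_stat = combined_5x2cv_f_statistic(example_diffs)
```

A p-value can then be obtained from the upper tail of the F distribution with (10, 5) degrees of freedom, for example with `scipy.stats.f.sf(f_stat, 10, 5)` if SciPy is available.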