Department of Computer Science, City, University of London, UK.
School of Psychology, University of Birmingham, UK.
Neuroimage Clin. 2024;43:103638. doi: 10.1016/j.nicl.2024.103638. Epub 2024 Jul 2.
Machine learning offers great potential for automated prediction of post-stroke symptoms and their response to rehabilitation. Major challenges for this endeavour include the very high dimensionality of neuroimaging data, the relatively small size of the datasets available for learning and interpreting the predictive features, and the question of how to effectively combine neuroimaging and tabular data (e.g. demographic information and clinical characteristics). This paper evaluates several solutions based on two strategies. The first is to use 2D images that summarise MRI scans. The second is to select key features that improve classification accuracy. Additionally, we introduce the novel approach of training a convolutional neural network (CNN) on images that combine regions of interest (ROIs) extracted from MRIs with symbolic representations of tabular data. We evaluate a series of CNN architectures (both 2D and 3D) that are trained on different representations of MRI and tabular data, to predict whether a composite measure of post-stroke spoken picture description ability is in the aphasic or non-aphasic range. MRI and tabular data were acquired from 758 English-speaking stroke survivors who participated in the PLORAS study. Each participant was assigned to one of five groups that were matched for initial severity of symptoms, recovery time, left lesion size and the time post-stroke (months or years) at which spoken description scores were collected. Training and validation were carried out on the first four groups. The fifth (lock-box/test set) group was used to test how well model accuracy generalises to new (unseen) data. The classification accuracy for a baseline logistic regression was 0.678 based on lesion size alone, rising to 0.757 and 0.813 when initial symptom severity and recovery time were successively added.
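The five-group design with a held-out lock-box group can be illustrated with a minimal sketch. This is not the authors' matching procedure: it assumes a single hypothetical numeric covariate per participant (the study matched on several variables), uses synthetic values, and the function name `assign_matched_groups` is invented for illustration. Participants are ranked by the covariate and dealt into groups block by block, so each group spans the full range of the covariate.

```python
import random

def assign_matched_groups(scores, n_groups=5, seed=0):
    """Deal participants into n_groups so each group spans the covariate range.

    scores: dict mapping participant id -> numeric covariate (hypothetical).
    Returns a dict mapping participant id -> group index in [0, n_groups).
    """
    rng = random.Random(seed)
    ranked = sorted(scores, key=lambda pid: scores[pid])
    groups = {}
    # Walk the ranked list in blocks of n_groups and shuffle the group labels
    # within each block, so no group always receives the lowest value in a block.
    for start in range(0, len(ranked), n_groups):
        block = ranked[start:start + n_groups]
        labels = list(range(n_groups))
        rng.shuffle(labels)
        for pid, group in zip(block, labels):
            groups[pid] = group
    return groups

# Synthetic covariate for 758 hypothetical participants (not study data).
data_rng = random.Random(1)
scores = {pid: data_rng.uniform(0, 100) for pid in range(758)}
groups = assign_matched_groups(scores)

# Groups 0-3 are used for training and validation; group 4 is the lock-box
# set, which is only evaluated once model selection is finished.
train_val = [pid for pid, g in groups.items() if g != 4]
lockbox = [pid for pid, g in groups.items() if g == 4]

# Mean covariate per group, to check that the groups are roughly matched.
group_means = []
for g in range(5):
    members = [scores[pid] for pid, grp in groups.items() if grp == g]
    group_means.append(sum(members) / len(members))
```

Keeping the fifth group untouched until the end means its accuracy estimates generalisation rather than the result of tuning on the same data.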
The highest classification accuracy (0.854), area under the curve (0.899) and F1 score (0.901) were observed when 8 regions of interest were extracted from each MRI scan and combined with lesion size, initial severity and recovery time in a 2D Residual Neural Network (ResNet). This was also the best model when data were limited to the 286 participants with moderate or severe initial aphasia (area under the curve = 0.865), a group that would be considered more difficult to classify. Our findings demonstrate how imaging and tabular data can be combined to achieve high post-stroke classification accuracy, even when the dataset is small in machine learning terms. We conclude by proposing how the current models could be improved to achieve even higher levels of accuracy using images from hospital scanners.
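For readers less familiar with the three reported metrics, the following minimal sketch shows how accuracy, F1 score and area under the ROC curve are computed for a binary classifier. The labels and scores are toy values, not study data. The choice of which class counts as positive and the 0.5 decision threshold are assumptions made here for illustration; the abstract does not state either.

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

def f1_score(y_true, y_pred, positive=1):
    """Harmonic mean of precision and recall for the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0

def roc_auc(y_true, y_score, positive=1):
    """Probability that a random positive case outscores a random negative one.

    Ties count as half a win (the Mann-Whitney formulation of AUC).
    """
    pos = [s for t, s in zip(y_true, y_score) if t == positive]
    neg = [s for t, s in zip(y_true, y_score) if t != positive]
    if not pos or not neg:
        raise ValueError("AUC needs at least one case from each class")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Toy labels (1 = positive class, an assumption) and model scores.
y_true = [1, 1, 1, 0, 0, 1, 0, 1]
y_score = [0.9, 0.8, 0.4, 0.3, 0.6, 0.7, 0.2, 0.55]
y_pred = [1 if s >= 0.5 else 0 for s in y_score]  # assumed 0.5 threshold

acc = accuracy(y_true, y_pred)
f1 = f1_score(y_true, y_pred)
auc = roc_auc(y_true, y_score)
```

Accuracy and F1 depend on the chosen threshold, whereas the area under the curve summarises ranking quality across all thresholds, which is one reason the three are reported together.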