Al-Fahad Rakib, Yeasin Mohammed, Glass John O, Conklin Heather M, Jacola Lisa M, Reddick Wilburn E
The University of Memphis, Memphis, Tennessee, USA.
St. Jude Children's Research Hospital, Memphis, Tennessee, USA.
IEEE Access. 2019;7:146662-146674. doi: 10.1109/access.2019.2946240. Epub 2019 Oct 8.
In the United States, acute lymphoblastic leukemia (ALL), the most common malignancy in children and adolescents, accounts for roughly 25% of childhood cancers diagnosed annually and has a 5-year survival rate as high as 94% [1]. This improved survival comes with an increased risk of delayed neurocognitive effects on attention, working memory, and processing speed [2]. Predictive modeling and characterization of these neurocognitive effects are critical for informing families and for identifying patients for targeted interventions. Current state-of-the-art methods rely mainly on hypothesis-driven statistical testing to characterize and model such cognitive events. While these techniques have proven useful for understanding cognitive abilities, they are inadequate for explaining causal relationships, individuality, and variation. In this study, we developed multivariate data-driven models to measure the late neurocognitive effects in ALL patients using behavioral phenotypes, diffusion tensor magnetic resonance imaging (DTI) based tractography data, morphometry statistics, and demographic variables. Alongside conventional machine learning and graph mining, we adopted "Stability Selection" to select the most relevant features and to choose models that are consistent over a range of parameters. The proposed approach demonstrated substantially improved accuracy (13% - 26%) over existing models and yielded relevant features that were verified by domain experts.
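The abstract describes "Stability Selection" as keeping features that are chosen consistently over a range of parameters. The general idea can be sketched as follows: run a base feature selector on many random half-samples of the data, repeat for each parameter setting, and keep the features whose selection frequency reaches a threshold for at least one setting. This is a minimal illustration and not the study's implementation. The base selector (top-k absolute Pearson correlation), the function names, the parameter grid `ks`, the threshold of 0.8, and the synthetic data are all assumptions made for the example.

```python
import random


def pearson(xs, ys):
    """Pearson correlation of two equal-length lists (0.0 if either is constant)."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    if vx == 0 or vy == 0:
        return 0.0
    return cov / (vx * vy) ** 0.5


def top_k_features(X, y, k):
    """Hypothetical base selector: indices of the k features most correlated with y."""
    n_features = len(X[0])
    scores = [abs(pearson([row[j] for row in X], y)) for j in range(n_features)]
    return sorted(range(n_features), key=lambda j: scores[j], reverse=True)[:k]


def stability_selection(X, y, ks=(1, 2), n_rounds=100, threshold=0.8, seed=0):
    """Return (stable feature indices, max selection frequency per feature).

    For each parameter value k, the base selector is run on n_rounds random
    half-samples. A feature is called stable if, for at least one k, it is
    selected in at least `threshold` of the half-samples.
    """
    rng = random.Random(seed)
    n = len(X)
    n_features = len(X[0])
    max_freq = [0.0] * n_features
    for k in ks:
        counts = [0] * n_features
        for _ in range(n_rounds):
            # Draw half of the rows without replacement.
            idx = rng.sample(range(n), n // 2)
            X_sub = [X[i] for i in idx]
            y_sub = [y[i] for i in idx]
            for j in top_k_features(X_sub, y_sub, k):
                counts[j] += 1
        for j in range(n_features):
            max_freq[j] = max(max_freq[j], counts[j] / n_rounds)
    stable = [j for j in range(n_features) if max_freq[j] >= threshold]
    return stable, max_freq


def make_toy_data(n=200, n_features=6, seed=1):
    """Synthetic data (an assumption): only features 0 and 1 influence y."""
    rng = random.Random(seed)
    X = [[rng.gauss(0, 1) for _ in range(n_features)] for _ in range(n)]
    y = [2.0 * row[0] - 2.0 * row[1] + rng.gauss(0, 0.5) for row in X]
    return X, y


if __name__ == "__main__":
    X, y = make_toy_data()
    stable, freq = stability_selection(X, y)
    print("stable features:", stable)
    print("max selection frequency:", [round(f, 2) for f in freq])
```

In practice the base selector would more likely be a regularized model such as the lasso, with the regularization strength playing the role of `ks`. The correlation-based selector is used here only to keep the sketch free of third-party dependencies.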