University of Pisa, Pisa, Italy; National Institute for Nuclear Physics (INFN), Pisa Division, Pisa, Italy.
Medical Physics Department, San Luca Hospital, 55100 Lucca, Italy.
Neuroimage Clin. 2022;35:103082. doi: 10.1016/j.nicl.2022.103082. Epub 2022 Jun 8.
Machine Learning (ML) techniques have been widely used in neuroimaging studies of Autism Spectrum Disorders (ASD), both to identify possible brain alterations related to this condition and to evaluate the predictive power of brain imaging modalities. The collection and public sharing of large imaging samples has further encouraged the use of ML-based analyses. However, multi-center data collections may suffer from batch effects, which, especially in Magnetic Resonance Imaging (MRI) studies, should be corrected to avoid confounding effects on ML classifiers and masking biases. This is particularly important when studying populations that are barely separable on the basis of MRI data, such as subjects with ASD compared to controls with typical development (TD). Here, we show how applying a harmonization protocol to brain structural features unlocks the case-control ML separation capability in the analysis of a multi-center MRI dataset. This effect is demonstrated on the ABIDE data collection, which includes subjects spanning a wide age range. After data harmonization, the overall ASD vs. TD discrimination capability of a Random Forest (RF) classifier improves from a very low performance (AUC = 0.58 ± 0.04) to a still low, but reasonably significant, AUC = 0.67 ± 0.03. The performance of the RF classifier was also evaluated in the age-specific subgroups of children, adolescents and adults, yielding AUC = 0.62 ± 0.02, AUC = 0.65 ± 0.03 and AUC = 0.69 ± 0.06, respectively. Specific and consistent patterns of anatomical differences related to the ASD condition were identified for the three age subgroups.
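To illustrate why a batch effect can mask case-control separation, the following is a minimal sketch on synthetic data. It is not the harmonization protocol or the RF classifier used in the study: it assumes a single hypothetical feature, made-up additive and multiplicative site effects, and a simple per-site standardization as a stand-in for harmonization. The AUC is computed directly as the probability that a case scores higher than a control. All function names and parameter values are illustrative assumptions.

```python
import random
import statistics


def auc(pos, neg):
    """Probability that a random case scores above a random control (ties count 0.5)."""
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def simulate(n_sites=10, n_per_group=40, effect=0.8, seed=0):
    """Return (site, label, value) rows; every parameter here is an assumption."""
    rng = random.Random(seed)
    rows = []
    for site in range(n_sites):
        offset = site - (n_sites - 1) / 2   # additive site (scanner) shift
        scale = 0.8 + 0.04 * site           # multiplicative site effect
        for label in (0, 1):                # 0 = control, 1 = case
            for _ in range(n_per_group):
                value = offset + scale * (effect * label + rng.gauss(0.0, 1.0))
                rows.append((site, label, value))
    return rows


def harmonize(rows):
    """Simplified stand-in for harmonization: standardize the feature within each site."""
    by_site = {}
    for site, _, value in rows:
        by_site.setdefault(site, []).append(value)
    stats = {s: (statistics.mean(v), statistics.pstdev(v)) for s, v in by_site.items()}
    return [(s, label, (v - stats[s][0]) / stats[s][1]) for s, label, v in rows]


def case_control_auc(rows):
    """AUC of the raw feature value used directly as a classification score."""
    cases = [v for _, label, v in rows if label == 1]
    controls = [v for _, label, v in rows if label == 0]
    return auc(cases, controls)


rows = simulate()
auc_raw = case_control_auc(rows)
auc_harmonized = case_control_auc(harmonize(rows))
print(f"AUC before harmonization: {auc_raw:.2f}")
print(f"AUC after harmonization:  {auc_harmonized:.2f}")
```

In this toy setting the between-site shifts are larger than the group difference, so pooling the raw values dilutes the separation that exists within each site. Per-site standardization is only safe here because every simulated site has the same case-control ratio and no covariates; with real multi-center data, site differences in diagnosis mix or age would need to be modeled explicitly rather than removed together with the site effect.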