AP-HP, Department of Clinical Research, Saint-Louis Hospital, Paris, France.
BMC Med Res Methodol. 2012 Jul 26;12:107. doi: 10.1186/1471-2288-12-107.
With a large number of potentially relevant clinical indicators, penalization and ensemble learning methods are thought to provide better predictive performance than the usual linear predictors. However, little is known about how they perform in clinical studies where few cases are available. We used Random Forests and Partial Least Squares Discriminant Analysis to select the most salient impairments in Developmental Coordination Disorder (DCD) and to assess similarity between patients.
We considered a wide-ranging test battery covering various neuropsychological and visuo-motor impairments, aimed at characterizing subtypes of DCD in a sample of 63 children. Classifiers were optimized on a training sample and then used to rank the 49 items according to a permutation-based measure of variable importance. In addition, subtyping consistency was assessed with cluster analysis on the training sample. Clustering fitness and predictive accuracy were evaluated on the validation sample.
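The ranking step described above can be illustrated with a minimal sketch of permutation-based variable importance. Everything in it is an assumption made for illustration: the data are synthetic binary items (not the study's 49 items or 63 children), a simple nearest-centroid rule stands in for Random Forests and Partial Least Squares Discriminant Analysis, and the function names (`fit_centroids`, `permutation_importance`) are hypothetical. The idea is only that an item matters if shuffling its values on the validation sample degrades accuracy.

```python
import random

def fit_centroids(X, y):
    """Per-class mean vector: a deliberately simple stand-in classifier."""
    sums, counts = {}, {}
    for row, label in zip(X, y):
        acc = sums.setdefault(label, [0.0] * len(row))
        for j, v in enumerate(row):
            acc[j] += v
        counts[label] = counts.get(label, 0) + 1
    return {c: [s / counts[c] for s in sums[c]] for c in sums}

def predict(centroids, row):
    """Assign the class whose centroid is closest (squared Euclidean distance)."""
    def dist(c):
        return sum((a - b) ** 2 for a, b in zip(row, centroids[c]))
    return min(sorted(centroids), key=dist)

def accuracy(centroids, X, y):
    return sum(predict(centroids, r) == t for r, t in zip(X, y)) / len(y)

def permutation_importance(centroids, X, y, n_repeats=20, seed=0):
    """Mean drop in accuracy when one item (column) is shuffled at a time."""
    rng = random.Random(seed)
    base = accuracy(centroids, X, y)
    scores = []
    for j in range(len(X[0])):
        drops = []
        for _ in range(n_repeats):
            col = [r[j] for r in X]
            rng.shuffle(col)
            X_perm = [r[:j] + [v] + r[j + 1:] for r, v in zip(X, col)]
            drops.append(base - accuracy(centroids, X_perm, y))
        scores.append(sum(drops) / n_repeats)
    return scores

# Synthetic data (assumption): 6 binary items, 3 classes driven by items 0 and 1 only.
rng = random.Random(42)
n_items = 6
X = [[rng.randint(0, 1) for _ in range(n_items)] for _ in range(400)]
y = [r[0] + r[1] for r in X]

# Fit on a training sample, rank items on a separate validation sample.
X_train, y_train = X[:300], y[:300]
X_valid, y_valid = X[300:], y[300:]

model = fit_centroids(X_train, y_train)
importance = permutation_importance(model, X_valid, y_valid)
ranking = sorted(range(n_items), key=lambda j: importance[j], reverse=True)

print("items ranked by importance:", ranking)
```

In this toy setting the two informative items should come out on top, while the noise items stay near zero importance. With a real classifier the same loop applies unchanged, since it only needs a fitted model and a scoring function.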
Both classifiers yielded a relevant subset of item impairments that together discriminated sharply between three DCD subtypes: ideomotor, visual-spatial and constructional, and mixed dyspraxia. The main impairments found to characterize the three subtypes were: digital perception, imitation of gestures, digital praxia, lego blocks, visual spatial structuration, visual motor integration, and coordination between upper and lower limbs. Classification accuracy was above 90% for all classifiers, and clustering fitness was found to be satisfactory.
Random Forests and Partial Least Squares Discriminant Analysis are useful tools for extracting salient features from a large pool of correlated binary predictors, and they also provide a way to assess proximities between individuals in a reduced factor space. Fewer than 15 neuro-visual, neuro-psychomotor and neuro-psychological tests might be required to provide a sensitive and specific diagnosis of DCD on this particular sample, and isolated markers might be used to refine our understanding of DCD in future studies.