Subtyping of children with developmental dyslexia via bootstrap aggregated clustering and the gap statistic: comparison with the double-deficit hypothesis.

Author information

King Wayne M, Giess Sally A, Lombardino Linda J

Affiliation

Department of Speech and Hearing Sciences, The Ohio State University, Columbus, OH 43210, USA.

Publication information

Int J Lang Commun Disord. 2007 Jan-Feb;42(1):77-95. doi: 10.1080/13682820600806680.

Abstract

BACKGROUND

The marked degree of heterogeneity in persons with developmental dyslexia has motivated the investigation of possible subtypes. Attempts have proceeded both from theoretical models of reading and the application of unsupervised learning (clustering) methods. Previous cluster analyses of data obtained from persons with reading disabilities have suffered from the inherent limitations of unsupervised learning methods. Specifically, the reliability and stability of cluster solutions have proven difficult to determine. Recent developments in the clustering literature have addressed these concerns by permitting checks on the internal validity of the solution. Resampling methods produce consistent groupings of the data independent of initialization effects, while the gap statistic provides a confidence measure for the determination of the optimal number of clusters present in the data. Combining these methods produces a robust data-driven classification that can be compared with theoretically based subtypes to produce better-informed models of developmental dyslexia.
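
The resampling idea can be illustrated with a minimal sketch. It is not the authors' procedure: it assumes a generic co-association (consensus) scheme in which k-means is fitted to bootstrap resamples and points are grouped when they share a cluster in most resamples. The function names (`kmeans_centers`, `co_association`, `consensus_groups`), the 0.5 threshold, and the number of resamples are all illustrative assumptions.

```python
import math
import random


def nearest(p, centers):
    """Index of the centre closest to point p (Euclidean distance)."""
    return min(range(len(centers)), key=lambda j: math.dist(p, centers[j]))


def kmeans_centers(points, k, rng, iters=50):
    """Plain Lloyd's k-means from a random start; returns the fitted centres."""
    centers = rng.sample(points, k)
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for p in points:
            clusters[nearest(p, centers)].append(p)
        # Keep the old centre if a cluster went empty.
        updated = [
            tuple(sum(x) / len(m) for x in zip(*m)) if m else centers[j]
            for j, m in enumerate(clusters)
        ]
        if updated == centers:
            break
        centers = updated
    return centers


def co_association(points, k, n_boot=30, seed=0):
    """Fraction of bootstrap fits in which each pair of points shares a cluster."""
    rng = random.Random(seed)
    n = len(points)
    together = [[0] * n for _ in range(n)]
    for _ in range(n_boot):
        sample = [rng.choice(points) for _ in range(n)]  # resample with replacement
        centers = kmeans_centers(sample, k, rng)
        labels = [nearest(p, centers) for p in points]   # label the original points
        for i in range(n):
            for j in range(n):
                if labels[i] == labels[j]:
                    together[i][j] += 1
    return [[count / n_boot for count in row] for row in together]


def consensus_groups(coassoc, threshold=0.5):
    """Connected components of the graph linking pairs above the threshold."""
    n = len(coassoc)
    group = [-1] * n
    next_id = 0
    for start in range(n):
        if group[start] != -1:
            continue
        group[start] = next_id
        stack = [start]
        while stack:
            i = stack.pop()
            for j in range(n):
                if group[j] == -1 and coassoc[i][j] > threshold:
                    group[j] = next_id
                    stack.append(j)
        next_id += 1
    return group
```

Because the co-association matrix depends only on whether two points share a cluster, it is unaffected by the arbitrary label order of each individual fit, which is one way such an aggregate can dampen initialization effects.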

AIMS

The present study is a novel application of resampling (bootstrap aggregating or bagging) methods and the gap statistic to the subtyping of children with developmental dyslexia. The specific aims of this study are: (1) to illustrate the use of bagging methods and the gap statistic in multivariate data obtained from children with developmental dyslexia; and (2) to compare the bagged clustering thresholded by the gap statistic against the predictions of the double-deficit hypothesis. The double-deficit hypothesis is a prominent theoretical model of developmental dyslexia, which predicts three subtypes: phonological, rate, and phonological-rate impaired readers.

METHODS & PROCEDURES

Three simulated data sets with known cluster structure were created to check the validity and illustrate the utility of the bagged clustering with the gap statistic in data with known structure. Subsequently, a clinical database of standardized test data (eight tests) from 93 children with developmental dyslexia was clustered using these methods. This procedure was repeated on a database of 93 children without reading disability matched for gender and age as a control. Finally, the clustering was repeated on the entire database of 186 participants. Cluster solutions were obtained for an increasing number of clusters (1-10) and were tested against the null hypothesis that no subtypes were present, i.e. the data represented a single cluster.
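
How a gap statistic can select the number of clusters, including the single-cluster null, is sketched below. This is an illustration under stated assumptions rather than the study's implementation: it uses plain k-means, reference data drawn uniformly over the bounding box of the observations, and the standard rule of choosing the smallest k with Gap(k) >= Gap(k+1) - s(k+1). The names `within_dispersion` and `gap_statistic` and the default parameter values are hypothetical.

```python
import math
import random


def within_dispersion(points, k, rng, restarts=3, iters=50):
    """Smallest within-cluster sum of squares found by k-means over a few restarts."""
    best = math.inf
    for _ in range(restarts):
        centers = rng.sample(points, k)
        for _step in range(iters):
            clusters = [[] for _ in range(k)]
            for p in points:
                j = min(range(k), key=lambda c: math.dist(p, centers[c]))
                clusters[j].append(p)
            # Keep the old centre if a cluster went empty.
            updated = [
                tuple(sum(x) / len(m) for x in zip(*m)) if m else centers[j]
                for j, m in enumerate(clusters)
            ]
            if updated == centers:
                break
            centers = updated
        wss = sum(
            math.dist(p, centers[j]) ** 2
            for j, members in enumerate(clusters)
            for p in members
        )
        best = min(best, wss)
    return best


def gap_statistic(points, k_max=10, n_refs=10, seed=0):
    """Return (chosen k, list of Gap(k) for k = 1..k_max)."""
    rng = random.Random(seed)
    lows = [min(d) for d in zip(*points)]
    highs = [max(d) for d in zip(*points)]
    # Reference sets with no cluster structure: uniform over the bounding box.
    refs = [
        [tuple(rng.uniform(lo, hi) for lo, hi in zip(lows, highs)) for _ in points]
        for _ in range(n_refs)
    ]
    gaps, errs = [], []
    for k in range(1, k_max + 1):
        log_w = math.log(within_dispersion(points, k, rng))
        ref_logs = [math.log(within_dispersion(ref, k, rng)) for ref in refs]
        mean_ref = sum(ref_logs) / n_refs
        sd = math.sqrt(sum((v - mean_ref) ** 2 for v in ref_logs) / n_refs)
        gaps.append(mean_ref - log_w)
        errs.append(sd * math.sqrt(1 + 1 / n_refs))
    # Smallest k whose gap is within one standard error of the next gap.
    for k in range(1, k_max):
        if gaps[k - 1] >= gaps[k] - errs[k]:
            return k, gaps
    return k_max, gaps
```

Under this rule a returned value of 1 corresponds to retaining the null hypothesis of a single cluster, which is the outcome the abstract reports for the children without dyslexia.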

OUTCOMES & RESULTS

Four clusters were identified in the children with developmental dyslexia. There was no evidence of significant cluster structure in the children without dyslexia. Two clusters were identified when children with and without reading impairments were considered together. Among the participants with developmental dyslexia, there was evidence of a phonological-deficit cluster, a rapid-naming cluster, and a cluster showing both depressed phonological processing and rapid naming. These accounted for 73 of the 93 participants (78%). All three are predicted by the double-deficit hypothesis. The fourth cluster consisted of children with normal phonological and rapid-naming ability incommensurate with their high verbal ability. An analysis of variance with post-hoc multiple comparisons demonstrated that the phonological, rapid-naming, and double-deficit clusters did not differ significantly in age, but the fourth cluster consisted of significantly older children. The mixed data set revealed two clusters. One cluster consisted almost entirely of the double-deficit and phonological subtypes. The other consisted of the participants without dyslexia and the children with dyslexia demonstrating either a single rapid-naming deficit or standardized test scores in the normal range. A silhouette analysis indicated that the four-cluster solution for the children with developmental dyslexia was superior to the two-cluster solution obtained for the entire data set.
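
For readers unfamiliar with silhouette analysis, the following sketch shows how a mean silhouette width can be computed for a given labelling; higher values indicate tighter, better separated clusters. It is a generic illustration with toy data, not the study's analysis, and the function name `mean_silhouette` is an assumption.

```python
import math


def mean_silhouette(points, labels):
    """Mean silhouette width; points in singleton clusters contribute 0."""
    n = len(points)
    total = 0.0
    for i in range(n):
        # Distances from point i to every other point, grouped by cluster.
        by_cluster = {}
        for j in range(n):
            if j != i:
                dist = math.dist(points[i], points[j])
                by_cluster.setdefault(labels[j], []).append(dist)
        own = by_cluster.pop(labels[i], None)
        if not own or not by_cluster:
            continue  # singleton cluster, or only one cluster overall
        a = sum(own) / len(own)                                  # cohesion
        b = min(sum(d) / len(d) for d in by_cluster.values())    # separation
        if max(a, b) > 0:
            total += (b - a) / max(a, b)
    return total / n


# Toy data: two tight pairs of points far apart.
toy = [(0, 0), (0, 1), (10, 0), (10, 1)]
good = mean_silhouette(toy, [0, 0, 1, 1])  # about 0.90
bad = mean_silhouette(toy, [0, 1, 0, 1])   # about -0.45
```

The labelling that matches the visible structure scores close to 1, while the labelling that cuts across it scores below 0, which is the sense in which one cluster solution can be called superior to another.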

CONCLUSIONS

The study provides support for the presence of distinct subtypes in children with developmental dyslexia and for the double-deficit hypothesis. Specifically, this study finds three subtypes predicted by the double-deficit hypothesis without the assumption of an a priori theoretical model of reading. Taken together, these subtypes account for 79% of the participants with dyslexia. Further, the percentages of children in each subtype are in good agreement with previous studies. The participants in the subtype not predicted by the double-deficit hypothesis were significantly older than the other three groups. Recent advances in unsupervised learning can be expected to aid the improvement and refinement of the definition of developmental dyslexia. If reliable and consistent subtypes can be identified among persons with developmental dyslexia, it is reasonable to assume that diagnostic and intervention efforts will be greatly improved.
