Zhang X, Tomblin J B
Department of Speech Pathology and Audiology, The University of Iowa, Iowa City, IA 52242, USA.
Brain Lang. 1998 Dec;65(3):395-403. doi: 10.1006/brln.1998.1999.
Three simulation experiments were conducted to determine the basis of the high predictive accuracy (98%) that Tallal, Stark, and Mellits (1985) obtained when using temporal processing variables to identify language impairment. In the first two experiments, a stepwise discriminant analysis that used a set of 160 arrays of random numbers to predict a dichotomous language status (either normal or disordered) yielded an average accuracy rate of 86.3%, compared with the 98% rate obtained by Tallal, Stark, and Mellits. The third experiment showed that a 95% accuracy rate could be obtained from an array of 160 variables, each of which may account for only about 1.5% of the variance in language ability. These results emphasize the need for confirmatory studies when large data sets are used to identify a small set of predictor variables.
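The effect described in the abstract can be illustrated with a small simulation. The sketch below is not the authors' procedure: it assumes a hypothetical sample of 60 cases and 6 retained predictors (neither figure is given in the abstract), and it substitutes a greedy forward selection on a least-squares linear classifier for a full stepwise discriminant analysis. The function names `forward_select`, `fit`, `accuracy`, and `simulate` are illustrative. The 160 predictors are pure noise, so any accuracy above chance on the selection sample comes from the selection step alone, which a fresh confirmatory sample should expose.

```python
import numpy as np


def forward_select(X, y, k):
    """Greedily pick k columns, each time adding the one that most reduces squared error."""
    n = X.shape[0]
    chosen = []
    for _ in range(k):
        best_j, best_sse = None, np.inf
        for j in range(X.shape[1]):
            if j in chosen:
                continue
            A = np.column_stack([np.ones(n), X[:, chosen + [j]]])
            coef, *_ = np.linalg.lstsq(A, y, rcond=None)
            sse = float(np.sum((y - A @ coef) ** 2))
            if sse < best_sse:
                best_j, best_sse = j, sse
        chosen.append(best_j)
    return chosen


def fit(X, y, cols):
    """Least-squares fit of the +1/-1 labels on the selected columns plus an intercept."""
    A = np.column_stack([np.ones(X.shape[0]), X[:, cols]])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return coef


def accuracy(X, y, cols, coef):
    """Proportion of cases whose predicted sign matches the label."""
    A = np.column_stack([np.ones(X.shape[0]), X[:, cols]])
    return float(np.mean(np.sign(A @ coef) == y))


def simulate(seed=0, n=60, p=160, k=6, reps=20):
    """Return mean accuracy on the selection sample and on a fresh sample.

    n, k, and reps are assumptions made for illustration; p=160 follows the abstract.
    """
    rng = np.random.default_rng(seed)
    y = np.repeat([1.0, -1.0], n // 2)        # dichotomous status, balanced groups
    train_acc, fresh_acc = [], []
    for _ in range(reps):
        X = rng.standard_normal((n, p))       # predictors unrelated to y
        cols = forward_select(X, y, k)
        coef = fit(X, y, cols)
        train_acc.append(accuracy(X, y, cols, coef))
        X_new = rng.standard_normal((n, p))   # confirmatory sample, same labels
        fresh_acc.append(accuracy(X_new, y, cols, coef))
    return float(np.mean(train_acc)), float(np.mean(fresh_acc))


if __name__ == "__main__":
    train_mean, fresh_mean = simulate()
    print(f"accuracy on selection sample: {train_mean:.3f}")
    print(f"accuracy on fresh sample:     {fresh_mean:.3f}")
```

Under these assumptions, accuracy on the selection sample should sit well above chance while accuracy on the fresh sample stays near 50%, which is the gap a confirmatory study is meant to reveal. The exact figures depend on the assumed sample size and number of retained predictors, so they are not expected to match the 86.3% reported in the abstract.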