M. Christopher Newland
Department of Psychology, Auburn University, Auburn, AL 36849 USA.
Perspect Behav Sci. 2019 Jun 19;42(3):583-616. doi: 10.1007/s40614-019-00206-1. eCollection 2019 Sep.
A reliance on null hypothesis significance testing (NHST) and misinterpretations of its results are thought to contribute to the replication crisis while impeding the development of a cumulative science. One solution is a data-analytic approach called Information-Theoretic (I-T) Model Selection, which builds upon Maximum Likelihood estimates. In the I-T approach, the scientist examines a set of candidate models and determines for each one the probability that it is closer to the truth than all others in the set. Although the theoretical development is subtle, the implementation of I-T analysis is straightforward. Models are sorted according to the probability that they are the best in light of the data collected. The I-T approach encourages the examination of multiple models, something investigators desire and that NHST discourages. This article is structured to address two objectives. The first is to illustrate the application of I-T data analysis to data from a virtual experiment. A noisy delay-discounting data set is generated and seven quantitative models are examined. The illustration demonstrates that it is not necessary to know the "truth" to identify the model that is closest to it, and that the most likely models conform to the model that generated the data. Second, we examine claims made by advocates of the I-T approach using Monte Carlo simulations in which 10,000 different data sets are generated and analyzed. The simulations showed that 1) the probabilities associated with each model returned by the single virtual experiment approximated those that resulted from the simulations, 2) models that were deemed close to the truth produced the most precise parameter estimates, and 3) adding a single replicate sharpened the ability to identify the most probable model.
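The ranking procedure the abstract describes can be sketched in a few lines. The following is a minimal, hypothetical illustration, not the article's actual analysis: it generates noisy data from a hyperbolic delay-discounting model, fits two candidate models (the article compares seven) by a crude grid-search maximum-likelihood fit under Gaussian error, and converts AIC differences into Akaike weights, i.e., the probability that each candidate is the best model in the set given the data. All numbers (delays, the discount rate `k_true`, the noise level) are invented for the sketch.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical delay-discounting data: subjective value of a delayed
# reward (as a fraction of its immediate value) at several delays.
delays = np.array([0, 1, 7, 30, 90, 180, 365], dtype=float)
k_true = 0.05
truth = 1.0 / (1.0 + k_true * delays)              # hyperbolic "truth"
data = truth + rng.normal(0.0, 0.03, delays.size)  # add observation noise

# Two candidate models; the article's analysis examines seven.
def hyperbolic(d, k):
    return 1.0 / (1.0 + k * d)

def exponential(d, k):
    return np.exp(-k * d)

def fit_aic(model, d, y, k_grid):
    # Grid-search least-squares fit; with Gaussian error this is the
    # maximum-likelihood fit.  K = 2 parameters (k and the error
    # variance) enter the AIC penalty.
    rss = min(np.sum((y - model(d, k)) ** 2) for k in k_grid)
    n, K = y.size, 2
    return n * np.log(rss / n) + 2 * K

k_grid = np.linspace(1e-4, 0.5, 2000)
aics = np.array([fit_aic(m, delays, data, k_grid)
                 for m in (hyperbolic, exponential)])

# Akaike weights: the probability that each model is the best
# (closest to the truth) among the candidates, in light of the data.
delta = aics - aics.min()
weights = np.exp(-delta / 2.0)
weights /= weights.sum()
print(dict(zip(["hyperbolic", "exponential"], weights.round(3))))
```

Because the data were generated by the hyperbolic model, its Akaike weight dominates, mirroring the abstract's point that the most likely models conform to the model that generated the data even though the "truth" is never consulted during the analysis.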