Franckowiak Ryan P, Panasci Michael, Jarvis Karl J, Acuña-Rodriguez Ian S, Landguth Erin L, Fortin Marie-Josée, Wagner Helene H
Environmental & Life Sciences Graduate Program, Trent University, Peterborough, Ontario, Canada.
Department of Natural Resources Management, Texas Tech University, Lubbock, Texas, United States of America.
PLoS One. 2017 Apr 13;12(4):e0175194. doi: 10.1371/journal.pone.0175194. eCollection 2017.
In landscape genetics, model selection procedures based on information-theoretic and Bayesian principles have been used with multiple regression on distance matrices (MRM) to test the relationship between multiple vectors of pairwise genetic, geographic, and environmental distance. Using Monte Carlo simulations, we examined whether model selection criteria based on Akaike's information criterion (AIC), its small-sample correction (AICc), and the Bayesian information criterion (BIC) reliably rank candidate models when applied with MRM across a range of sample sizes. The results revealed a serious problem: all three criteria exhibit a systematic bias toward selecting unnecessarily complex models containing spurious random variables, and they erroneously suggest a high level of support for the incorrectly ranked best model. These problems grew worse as sample size increased. The failure of AIC, AICc, and BIC was likely driven by the inflated sample size and the different sums of squares partitioned by MRM, together with their resulting effect on delta values. Based on these findings, we strongly discourage the continued application of AIC, AICc, and BIC for model selection with MRM.
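The "inflated sample size" mentioned above can be made concrete with a small sketch. With n sampled individuals, a regression on vectorized distance matrices sees n(n-1)/2 pairwise observations, and the usual least-squares forms of AIC, AICc, and BIC treat each pair as independent. The Python code below is only an illustration under stated assumptions: the data-generating process (a one-dimensional position, a trait equal to position plus noise, and a pure-noise "junk" variable) and the function names `pairwise_vector`, `info_criteria`, and `spurious_selection_rate` are hypothetical and are not the simulation design used in the paper. It fits plain ordinary least squares to the lower-triangle distance vectors and does not include the permutation testing normally associated with MRM.

```python
import numpy as np


def pairwise_vector(values):
    """Return the lower-triangle vector of absolute pairwise differences."""
    values = np.asarray(values, dtype=float)
    diff = np.abs(values[:, None] - values[None, :])
    i, j = np.tril_indices(len(values), k=-1)
    return diff[i, j]


def info_criteria(y, predictors):
    """Least-squares AIC, AICc, and BIC, counting every pair as one observation."""
    n_obs = len(y)  # n(n-1)/2 pairs, not n individuals
    X = np.column_stack([np.ones(n_obs)] + list(predictors))
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    rss = float(np.sum((y - X @ beta) ** 2))
    k = X.shape[1] + 1  # regression coefficients plus the residual variance
    aic = n_obs * np.log(rss / n_obs) + 2 * k
    aicc = aic + 2 * k * (k + 1) / (n_obs - k - 1)
    bic = n_obs * np.log(rss / n_obs) + k * np.log(n_obs)
    return aic, aicc, bic


def spurious_selection_rate(n_individuals, n_reps=200, seed=0):
    """Fraction of replicates in which each criterion prefers the model
    that adds a pure-noise distance variable (toy process, an assumption)."""
    rng = np.random.default_rng(seed)
    wins = np.zeros(3)
    for _ in range(n_reps):
        position = rng.uniform(0.0, 1.0, n_individuals)
        trait = position + rng.normal(0.0, 0.5, n_individuals)
        junk_values = rng.normal(0.0, 1.0, n_individuals)

        y = pairwise_vector(trait)          # stand-in for genetic distance
        geo = pairwise_vector(position)     # relevant predictor
        junk = pairwise_vector(junk_values) # spurious predictor

        simple = np.array(info_criteria(y, [geo]))
        complex_ = np.array(info_criteria(y, [geo, junk]))
        wins += complex_ < simple  # lower criterion value is preferred
    return wins / n_reps


if __name__ == "__main__":
    for n in (10, 20, 40):
        n_pairs = n * (n - 1) // 2
        rates = spurious_selection_rate(n)
        print(n, n_pairs, np.round(rates, 2))  # columns: AIC, AICc, BIC
```

Running the script prints, for each number of individuals, the number of pairs entering the regression and the rate at which each criterion favors the model with the spurious variable. The exact rates depend on the seed and on the toy process chosen here, so they should not be read as a reproduction of the published results.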