Model selection with multiple regression on distance matrices leads to incorrect inferences.

Author information

Franckowiak Ryan P, Panasci Michael, Jarvis Karl J, Acuña-Rodriguez Ian S, Landguth Erin L, Fortin Marie-Josée, Wagner Helene H

Affiliations

Environmental & Life Sciences Graduate Program, Trent University, Peterborough, Ontario, Canada.

Department of Natural Resources Management, Texas Tech University, Lubbock, Texas, United States of America.

Publication information

PLoS One. 2017 Apr 13;12(4):e0175194. doi: 10.1371/journal.pone.0175194. eCollection 2017.

DOI:10.1371/journal.pone.0175194

PMID:28406923

Original article: https://pmc.ncbi.nlm.nih.gov/articles/PMC5390996/

Abstract

In landscape genetics, model selection procedures based on Information Theoretic and Bayesian principles have been used with multiple regression on distance matrices (MRM) to test the relationship between multiple vectors of pairwise genetic, geographic, and environmental distance. Using Monte Carlo simulations, we examined the ability of model selection criteria based on Akaike's information criterion (AIC), its small-sample correction (AICc), and the Bayesian information criterion (BIC) to reliably rank candidate models when applied with MRM while varying the sample size. The results showed a serious problem: all three criteria exhibit a systematic bias toward selecting unnecessarily complex models containing spurious random variables and erroneously suggest a high level of support for the incorrectly ranked best model. These problems effectively increased with increasing sample size. The failure of AIC, AICc, and BIC was likely driven by the inflated sample size and different sum-of-squares partitioned by MRM, and the resulting effect on delta values. Based on these findings, we strongly discourage the continued application of AIC, AICc, and BIC for model selection with MRM.
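The sample-size argument in the abstract can be illustrated with a small arithmetic sketch. It is not the authors' simulation. It assumes the standard least-squares forms of AIC, AICc, and BIC, and a hypothetical spurious predictor that reduces the residual sum of squares by 1% (the `shrink` value, the RSS of 100, and all function names are illustrative assumptions). The only thing varied is whether the criteria treat the number of sampled individuals or the number of pairwise distances, n(n-1)/2, as the sample size.

```python
import math


def n_pairs(n_individuals):
    """Number of unique pairwise distances among n sampled individuals."""
    return n_individuals * (n_individuals - 1) // 2


def info_criteria(rss, n_obs, k):
    """Least-squares AIC, AICc, and BIC (additive constants dropped).

    rss   : residual sum of squares of the fitted model
    n_obs : number of observations the criterion treats as the sample size
    k     : number of estimated parameters
    """
    fit = n_obs * math.log(rss / n_obs)
    aic = fit + 2 * k
    aicc = aic + 2 * k * (k + 1) / (n_obs - k - 1)
    bic = fit + k * math.log(n_obs)
    return {"AIC": aic, "AICc": aicc, "BIC": bic}


def delta_simple_minus_complex(n_obs, rss_simple=100.0, shrink=0.99, k_simple=3):
    """Criterion differences; positive values favor the more complex model.

    The complex model adds one hypothetical spurious predictor that lowers
    the RSS by the factor `shrink` (an assumed value, not from the paper).
    """
    simple = info_criteria(rss_simple, n_obs, k_simple)
    complex_ = info_criteria(rss_simple * shrink, n_obs, k_simple + 1)
    return {name: simple[name] - complex_[name] for name in simple}


n_ind = 50
as_individuals = delta_simple_minus_complex(n_ind)        # n = 50
as_pairs = delta_simple_minus_complex(n_pairs(n_ind))     # n = 1225

print("n = individuals:", as_individuals)
print("n = pairs:      ", as_pairs)
```

Under these assumptions, the same 1% drop in RSS is penalized away when n is the number of individuals, but favors the model with the spurious predictor under all three criteria when n is the number of pairwise distances. This is consistent with the abstract's suggestion that the inflated sample size affects the delta values, though the paper's own simulations should be consulted for the actual effect sizes.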

