Department of Psychology, University of Miami, 5665 Ponce de Leon Blvd, Coral Gables, FL, 33146, USA.
Department of Psychology, University of California San Diego, San Diego, USA.
Psychometrika. 2023 Sep;88(3):1032-1055. doi: 10.1007/s11336-023-09914-9. Epub 2023 May 23.
In this paper, we review existing tools for solving variable selection problems in psychology. Modern regularization methods such as lasso regression have recently been introduced in the field and are incorporated into popular methodologies, such as network analysis. However, several recognized limitations of lasso regularization may limit its suitability for psychological research. We compare the properties of lasso approaches used for variable selection with those of Bayesian variable selection approaches. In particular, we highlight advantages of stochastic search variable selection (SSVS) that make it well suited for variable selection applications in psychology. We demonstrate these advantages and contrast SSVS with lasso-type penalization in an application predicting depression symptoms in a large sample and in an accompanying simulation study. We investigate the effects of sample size, effect size, and patterns of correlation among predictors on rates of correct and false inclusion and on bias in the estimates. SSVS as investigated here is reasonably computationally efficient and powerful enough to detect moderate effects in small samples (or small effects in moderate samples), while protecting against false inclusion and without over-penalizing true effects. We recommend SSVS as a flexible framework that is well suited to the field, discuss its limitations, and suggest directions for future development.
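To make the idea behind SSVS concrete, the following is a minimal sketch of a Gibbs sampler for a common normal-mixture (spike-and-slab) formulation of the method. It is not the authors' implementation: the function name `ssvs_gibbs`, the spike scale `tau`, the slab multiplier `c`, the prior inclusion probability `prior_incl`, and the inverse-gamma hyperparameters `a0` and `b0` are all illustrative assumptions, and the defaults are not taken from the paper.

```python
import numpy as np

def ssvs_gibbs(X, y, n_iter=2000, burn_in=500, tau=0.05, c=20.0,
               prior_incl=0.5, a0=1.0, b0=1.0, seed=0):
    """Hypothetical SSVS sketch: returns (inclusion probabilities, mean betas).

    Assumed model: each coefficient has a N(0, tau^2) "spike" prior when
    excluded and a N(0, (c*tau)^2) "slab" prior when included.
    """
    rng = np.random.default_rng(seed)
    # Standardize predictors and center the outcome so no intercept is needed.
    X = (X - X.mean(axis=0)) / X.std(axis=0)
    y = y - y.mean()
    n, p = X.shape
    XtX, Xty = X.T @ X, X.T @ y

    spike_var, slab_var = tau ** 2, (c * tau) ** 2
    gamma = np.ones(p, dtype=bool)      # inclusion indicators, start all-in
    sigma2 = float(np.var(y))           # residual variance, crude start value
    incl_sum, beta_sum = np.zeros(p), np.zeros(p)

    for it in range(n_iter):
        # 1) Draw beta | gamma, sigma2 from its multivariate normal conditional.
        prior_var = np.where(gamma, slab_var, spike_var)
        cov = np.linalg.inv(XtX / sigma2 + np.diag(1.0 / prior_var))
        cov = (cov + cov.T) / 2.0       # guard against round-off asymmetry
        mean = cov @ Xty / sigma2
        beta = mean + np.linalg.cholesky(cov) @ rng.standard_normal(p)

        # 2) Draw sigma2 | beta from its inverse-gamma conditional.
        resid = y - X @ beta
        sigma2 = 1.0 / rng.gamma(a0 + n / 2.0, 1.0 / (b0 + resid @ resid / 2.0))

        # 3) Draw each gamma_j | beta_j by comparing slab and spike densities.
        log_slab = (np.log(prior_incl) - 0.5 * np.log(slab_var)
                    - 0.5 * beta ** 2 / slab_var)
        log_spike = (np.log1p(-prior_incl) - 0.5 * np.log(spike_var)
                     - 0.5 * beta ** 2 / spike_var)
        gamma = rng.random(p) < 1.0 / (1.0 + np.exp(log_spike - log_slab))

        if it >= burn_in:
            incl_sum += gamma
            beta_sum += beta

    kept = n_iter - burn_in
    return incl_sum / kept, beta_sum / kept
```

Calling `ssvs_gibbs(X, y)` on a predictor matrix and outcome vector returns, for each predictor, the proportion of retained draws in which it was included, along with its posterior mean coefficient on the standardized scale. Predictors are then typically selected by thresholding the inclusion proportion; any particular cutoff is an analyst's choice rather than something this sketch prescribes.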