Covariate Selection for Multilevel Models with Missing Data.

Authors

Marino Miguel, Buxton Orfeu M, Li Yi

Affiliations

Department of Family Medicine, Department of Public Health, Division of Biostatistics, Oregon Health and Science University, Portland, OR 97239 USA.

Associate Professor, Department of Biobehavioral Health, Pennsylvania State University, University Park, PA 16802. Lecturer on Medicine, Division of Sleep Medicine, Harvard Medical School, Boston, MA 02115. Associate Neuroscientist, Department of Medicine, Brigham and Women's Hospital, Boston, MA 02115. Adjunct Associate Professor, Department of Social and Behavioral Sciences, Harvard T.H. Chan School of Public Health, Boston, MA 02115.

Publication information

Stat (Int Stat Inst). 2017;6(1):31-46. doi: 10.1002/sta4.133. Epub 2017 Jan 8.

DOI:10.1002/sta4.133
PMID:28239457
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC5323238/
Abstract

Missing covariate data hampers variable selection in multilevel regression settings. Current variable selection techniques for multiply-imputed data commonly address missingness in the predictors through list-wise deletion and stepwise-selection methods which are problematic. Moreover, most variable selection methods are developed for independent linear regression models and do not accommodate multilevel mixed effects regression models with incomplete covariate data. We develop a novel methodology that is able to perform covariate selection across multiply-imputed data for multilevel random effects models when missing data is present. Specifically, we propose to stack the multiply-imputed data sets from a multiple imputation procedure and to apply a group variable selection procedure through group lasso regularization to assess the overall impact of each predictor on the outcome across the imputed data sets. Simulations confirm the advantageous performance of the proposed method compared with the competing methods. We applied the method to reanalyze the Healthy Directions-Small Business cancer prevention study, which evaluated a behavioral intervention program targeting multiple risk-related behaviors in a working-class, multi-ethnic population.
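
The stacking-plus-group-lasso idea in the abstract can be illustrated with a small sketch. This is not the authors' implementation: it assumes an ordinary linear model with no random effects, a fully observed outcome, and a deliberately crude stand-in for the imputation step (random draws around the observed column mean). The function names, the penalty value `lam=1.0`, and the simulated data are all hypothetical. Each predictor gets one coefficient per imputed data set, and those coefficients form one group, so a predictor is kept or dropped across all imputed data sets at once.

```python
import numpy as np


def group_soft_threshold(v, t):
    """Proximal operator of t * ||v||_2: shrink the whole group or zero it."""
    norm = np.linalg.norm(v)
    if norm <= t:
        return np.zeros_like(v)
    return (1.0 - t / norm) * v


def stacked_group_lasso(X_list, y, lam, n_iter=500):
    """Group lasso over M imputed design matrices (hypothetical helper).

    X_list: M arrays of shape (n, p), one per imputed data set.
    y: outcome of shape (n,), assumed fully observed.
    Returns B of shape (M, p); column j holds predictor j's coefficient in
    every imputed data set and is penalized as one group.
    """
    M = len(X_list)
    n, p = X_list[0].shape
    B = np.zeros((M, p))
    # Step size 1/L, where L bounds the curvature of the squared-error loss.
    L = max(np.linalg.norm(X, 2) ** 2 / n for X in X_list)
    step = 1.0 / L
    for _ in range(n_iter):
        # Gradient of (1 / 2n) * ||y - X_m b_m||^2 for each imputation m.
        G = np.stack([X.T @ (X @ b - y) / n for X, b in zip(X_list, B)])
        Z = B - step * G
        for j in range(p):
            B[:, j] = group_soft_threshold(Z[:, j], step * lam)
    return B


# Simulated example: predictors 0 and 2 carry signal, the rest are noise.
rng = np.random.default_rng(0)
n, p, M = 1000, 5, 5
X_full = rng.normal(size=(n, p))
beta_true = np.array([2.0, 0.0, -1.5, 0.0, 0.0])
y = X_full @ beta_true + rng.normal(size=n)
missing = rng.random((n, p)) < 0.2  # about 20% of covariate cells go missing

# Toy imputation (an assumption, not a proper multiple imputation procedure):
# fill each missing cell with a random draw around the observed column mean.
X_list = []
for _ in range(M):
    X_m = X_full.copy()
    for j in range(p):
        obs = X_full[~missing[:, j], j]
        n_miss = int(missing[:, j].sum())
        X_m[missing[:, j], j] = obs.mean() + obs.std() * rng.normal(size=n_miss)
    X_list.append(X_m)

B = stacked_group_lasso(X_list, y, lam=1.0)
selected = np.flatnonzero(np.linalg.norm(B, axis=0) > 0)
print("selected predictors:", selected.tolist())
```

Grouping by predictor, not by imputed data set, is the point of the design: the penalty acts on the norm of each column of `B`, so selection decisions cannot disagree between imputations. In practice the penalty would be tuned (for example by cross-validation or an information criterion), and a multilevel model would add random effects to the loss.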
