Linear Model Selection when Covariates Contain Errors

Author information

Zhang Xinyu, Wang Haiying, Ma Yanyuan, Carroll Raymond J

Affiliations

Academy of Mathematics and Systems Science, Chinese Academy of Sciences, Beijing 100190, China,

Department of Mathematics and Statistics, University of New Hampshire, Durham, NH 03824,

Publication information

J Am Stat Assoc. 2017;112(520):1553-1561. doi: 10.1080/01621459.2016.1219262. Epub 2017 Jun 29.

DOI:10.1080/01621459.2016.1219262

PMID:29416191

Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC5798903/

Abstract

Prediction precision is arguably the most relevant criterion of a model in practice and is often a sought after property. A common difficulty with covariates measured with errors is the impossibility of performing prediction evaluation on the data even if a model is completely given without any unknown parameters. We bypass this inherent difficulty by using special properties on moment relations in linear regression models with measurement errors. The end product is a model selection procedure that achieves the same optimality properties that are achieved in classical linear regression models without covariate measurement error. Asymptotically, the procedure selects the model with the minimum prediction error in general, and selects the smallest correct model if the regression relation is indeed linear. Our model selection procedure is useful in prediction when future covariates without measurement error become available, e.g., due to improved technology or better management and design of data collection procedures.
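To make the "moment relations" idea concrete, here is a minimal simulation sketch. It is not the paper's selection criterion, which the abstract does not spell out. It assumes classical additive error, W = X + U, where U has mean zero and a known covariance `sigma_u` and is independent of X and of the regression noise. Under those assumptions, E[(Y - W'b)^2] = E[(Y - X'b)^2] + b' sigma_u b for any fixed b, so the prediction error on error-free covariates can be estimated from error-prone data alone. The function names `corrected_prediction_error` and `corrected_least_squares` are hypothetical and chosen only for this illustration.

```python
import numpy as np


def corrected_prediction_error(W, y, beta, sigma_u):
    """Estimate E[(y - x'beta)^2] using only the error-prone covariates W.

    Assumes W = X + U with U mean-zero, covariance sigma_u (known),
    and independent of (X, y).
    """
    naive = np.mean((y - W @ beta) ** 2)
    # Subtract the inflation caused by measurement error: beta' sigma_u beta.
    return naive - beta @ sigma_u @ beta


def corrected_least_squares(W, y, sigma_u):
    """Moment-corrected estimator: solve (W'W/n - sigma_u) b = W'y/n."""
    n = W.shape[0]
    return np.linalg.solve(W.T @ W / n - sigma_u, W.T @ y / n)


# Simulated data (all values below are illustrative assumptions).
rng = np.random.default_rng(0)
n, p = 200_000, 3
beta_true = np.array([1.0, -2.0, 0.5])
sigma_u = np.diag([0.3, 0.2, 0.1])

X = rng.normal(size=(n, p))                                 # true covariates
U = rng.multivariate_normal(np.zeros(p), sigma_u, size=n)   # measurement error
W = X + U                                                   # observed covariates
y = X @ beta_true + rng.normal(scale=0.5, size=n)

beta_hat = corrected_least_squares(W, y, sigma_u)

# Oracle error needs the unobservable X; the estimate uses only W.
oracle = np.mean((y - X @ beta_hat) ** 2)
estimate = corrected_prediction_error(W, y, beta_hat, sigma_u)
naive = np.mean((y - W @ beta_hat) ** 2)

print(f"oracle={oracle:.3f} corrected={estimate:.3f} naive={naive:.3f}")
```

In this sketch the naive residual error computed from W overstates the error that would be seen on error-free covariates, while the corrected quantity tracks the oracle value closely. A selection procedure in this spirit could compare candidate models by such a corrected error, though the exact criterion and penalty used in the paper are not given in the abstract.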


Similar articles

MALMEM: model averaging in linear measurement error models.

J R Stat Soc Series B Stat Methodol. 2019 Sep;81(4):763-779. doi: 10.1111/rssb.12317. Epub 2019 Jun 2.

Folic acid supplementation and malaria susceptibility and severity among people taking antifolate antimalarial drugs in endemic areas.

Cochrane Database Syst Rev. 2022 Feb 1;2(2022):CD014217. doi: 10.1002/14651858.CD014217.

Simultaneous treatment of unspecified heteroskedastic model error distribution and mismeasured covariates for restricted moment models.

J Econom. 2017 Oct;200(2):194-206. doi: 10.1016/j.jeconom.2017.06.005. Epub 2017 Jul 8.

MEBoost: Variable selection in the presence of measurement error.

Stat Med. 2019 Jul 10;38(15):2705-2718. doi: 10.1002/sim.8130. Epub 2019 Mar 11.

Identification and Estimation of Nonlinear Models Using Two Samples with Nonclassical Measurement Errors.

J Nonparametr Stat. 2010 May 1;22(4):379-399. doi: 10.1080/10485250902874688.

Variable Selection for Partially Linear Models with Measurement Errors.

J Am Stat Assoc. 2009;104(485):234-248. doi: 10.1198/jasa.2009.0127.

Regression calibration to correct correlated errors in outcome and exposure.

Stat Med. 2021 Jan 30;40(2):271-286. doi: 10.1002/sim.8773. Epub 2020 Oct 21.

A general algorithm for error-in-variables regression modelling using Monte Carlo expectation maximization.

PLoS One. 2023 Apr 3;18(4):e0283798. doi: 10.1371/journal.pone.0283798. eCollection 2023.

Variable Selection in Measurement Error Models.

Bernoulli (Andover). 2010;16(1):274-300. doi: 10.3150/09-bej205.

Cited by

Conformal normal curvature and detection of masked observations in multivariate null intercept measurement error models.

J Appl Stat. 2023 May 21;51(8):1545-1569. doi: 10.1080/02664763.2023.2212332. eCollection 2024.

Model averaging for right censored data with measurement error.

Lifetime Data Anal. 2024 Apr;30(2):501-527. doi: 10.1007/s10985-024-09620-3. Epub 2024 Mar 13.

Logistic regression error-in-covariate models for longitudinal high-dimensional covariates.

Stat. 2019;8(1). doi: 10.1002/sta4.246. Epub 2019 Dec 26.

STRATOS guidance document on measurement error and misclassification of variables in observational epidemiology: Part 2-More complex methods of adjustment and advanced topics.

Stat Med. 2020 Jul 20;39(16):2232-2263. doi: 10.1002/sim.8531. Epub 2020 Apr 3.
