Logistic regression with correlated measurement error and misclassification in covariates.

Author information

Cao Zhiqiang, Wong Man Yu, Cheng Garvin Hl

Affiliations

College of Big Data and Internet, Shenzhen Technology University, Shenzhen, China.

Department of Mathematics, The Hong Kong University of Science and Technology, Hong Kong, China.

Publication information

Stat Methods Med Res. 2023 Apr;32(4):789-805. doi: 10.1177/09622802231154324. Epub 2023 Feb 15.

Abstract

Many areas of research, such as nutritional epidemiology, encounter measurement error in continuous covariates and misclassification of categorical variables during modeling. It is well known that ignoring measurement error or misclassification can lead to biased results. However, most research has addressed these two problems separately, and handling both simultaneously in a single analysis has been studied less actively. In this article, we propose a new correction method for logistic regression that simultaneously handles correlated measurement errors in multivariate continuous covariates and misclassification in a categorical variable. The method is not computationally intensive because a closed form of the approximate likelihood function, conditional on the observed covariates, is derived. The asymptotic normality of the proposed estimator is established under regularity conditions, and its finite-sample performance is examined empirically in simulation studies. We apply the new estimation method to handle measurement error in some nutrients of interest and misclassification of a categorical physical activity variable in the European Prospective Investigation into Cancer and Nutrition-InterAct Study data. The analyses show that fruit is negatively associated with type 2 diabetes among physically active women, that protein is positively associated with type 2 diabetes in the less physically active group, and that actual physical activity has a greater effect on reducing the risk of type 2 diabetes than observed physical activity.
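
The bias that the abstract attributes to ignoring measurement error and misclassification can be illustrated with a small simulation. The sketch below is not the authors' correction method. It only fits an uncorrected ("naive") logistic regression to error-prone covariates and compares it with a fit on the true covariates. The sample size, coefficients, error covariance, and 20% misclassification rate are illustrative assumptions, and the helper names `fit_logistic` and `simulate` are hypothetical.

```python
import numpy as np


def fit_logistic(X, y, n_iter=25):
    """Fit logistic regression by Newton-Raphson. X must include an intercept column."""
    beta = np.zeros(X.shape[1])
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-X @ beta))
        grad = X.T @ (y - p)                           # score vector
        hess = (X * (p * (1.0 - p))[:, None]).T @ X    # observed information
        beta = beta + np.linalg.solve(hess, grad)
    return beta


def simulate(n=20000, seed=0):
    """Return (true coefficients, fit on true covariates, naive fit on observed covariates)."""
    rng = np.random.default_rng(seed)

    # True covariates: two correlated continuous variables and one binary variable.
    x = rng.multivariate_normal([0.0, 0.0], [[1.0, 0.3], [0.3, 1.0]], size=n)
    z = rng.binomial(1, 0.5, size=n)

    # Assumed true model: intercept, x1, x2, z.
    beta_true = np.array([-1.0, 0.8, -0.5, -0.7])
    X_true = np.column_stack([np.ones(n), x, z])
    y = rng.binomial(1, 1.0 / (1.0 + np.exp(-X_true @ beta_true)))

    # Observed continuous covariates: additive errors that are correlated with each other.
    u = rng.multivariate_normal([0.0, 0.0], [[0.5, 0.25], [0.25, 0.5]], size=n)
    w = x + u

    # Observed binary covariate: each value is flipped with probability 0.2.
    flip = rng.binomial(1, 0.2, size=n)
    z_obs = np.where(flip == 1, 1 - z, z)

    X_obs = np.column_stack([np.ones(n), w, z_obs])
    return beta_true, fit_logistic(X_true, y), fit_logistic(X_obs, y)


if __name__ == "__main__":
    beta_true, fit_true, fit_naive = simulate()
    print("true coefficients:      ", beta_true)
    print("fit on true covariates: ", np.round(fit_true, 3))
    print("naive fit on observed:  ", np.round(fit_naive, 3))
```

Under these assumed settings, the naive coefficients for the first continuous covariate and for the binary covariate are pulled toward zero relative to the fit on the true covariates, which is the kind of bias a joint correction method is meant to remove.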
