The relationship between hot-deck multiple imputation and weighted likelihood.

Authors

Reilly M, Pepe M

Affiliation

Department of Statistics, University College Dublin, Belfield, Ireland.

Publication details

Stat Med. 1997;16(1-3):5-19. doi: 10.1002/(sici)1097-0258(19970115)16:1<5::aid-sim469>3.0.co;2-8.

DOI: 10.1002/(sici)1097-0258(19970115)16:1<5::aid-sim469>3.0.co;2-8
PMID: 9004380

Abstract

Hot-deck imputation is an intuitively simple and popular method of accommodating incomplete data. Users of the method will often use the usual multiple imputation variance estimator which is not appropriate in this case. However, no variance expression has yet been derived for this easily implemented method applied to missing covariates in regression models. The simple hot-deck method is in fact asymptotically equivalent to the mean-score method for the estimation of a regression model parameter, so that hot-deck can be understood in the context of likelihood methods. Both of these methods accommodate data where missingness may depend on the observed variables but not on the unobserved value of the incomplete covariate, that is, missing at random (MAR). The asymptotic properties of hot-deck are derived here for the case where the fully observed variables are categorical, though the incomplete covariate(s) may be continuous. Simulation studies indicate that the two methods compare well in small samples and for small numbers of imputations. Current users of hot-deck may now conduct their analysis using mean-score, which is a weighted likelihood method and can thus be implemented by a single pass through the data using any standard package which accommodates weighted regression models. Valid inference is now straightforward using the variance expression provided here. The equivalence of mean-score and hot-deck is illustrated using three clinical data sets where an important covariate is missing for a large number of study subjects.
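
The abstract's claim that mean-score can be run as a weighted regression in a single pass can be illustrated with a small sketch. It is not the authors' code. It assumes that each record has been assigned a stratum label formed from its fully observed categorical variables (outcome included), and that mean-score weights each complete case by the stratum total divided by the number of complete cases in that stratum, as the method is commonly described. The function names `mean_score_weights` and `hot_deck_draw` are hypothetical.

```python
import numpy as np


def mean_score_weights(strata, observed):
    """Return one weight per record: n(stratum) / n_complete(stratum) for
    complete cases, and 0 for records whose covariate is missing."""
    strata = np.asarray(strata)
    observed = np.asarray(observed, dtype=bool)
    weights = np.zeros(len(strata), dtype=float)
    for s in np.unique(strata):
        in_s = strata == s
        n_total = in_s.sum()
        n_complete = (in_s & observed).sum()
        if n_complete == 0:
            raise ValueError(f"stratum {s!r} has no complete cases")
        weights[in_s & observed] = n_total / n_complete
    return weights


def hot_deck_draw(x, strata, observed, rng):
    """Return one hot-deck completed copy of x: every missing value is
    replaced by a randomly chosen donor value from the same stratum."""
    x = np.asarray(x, dtype=float).copy()
    strata = np.asarray(strata)
    observed = np.asarray(observed, dtype=bool)
    for s in np.unique(strata):
        in_s = strata == s
        donors = x[in_s & observed]
        missing = in_s & ~observed
        if missing.any():
            x[missing] = rng.choice(donors, size=missing.sum(), replace=True)
    return x


# Toy data: five records in two strata, two with the covariate missing.
strata = [0, 0, 0, 1, 1]
observed = [True, False, True, True, False]
x = [1.0, np.nan, 3.0, 5.0, np.nan]

weights = mean_score_weights(strata, observed)   # [1.5, 0.0, 1.5, 2.0, 0.0]
filled = hot_deck_draw(x, strata, observed, np.random.default_rng(0))
```

Under these assumptions, the weights would be passed to any weighted regression routine together with the complete cases only, whereas hot-deck would repeat `hot_deck_draw` several times and refit the model on each completed data set. The weights sum to the full sample size, which is one quick sanity check.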

Similar articles

1. The relationship between hot-deck multiple imputation and weighted likelihood.
   Stat Med. 1997;16(1-3):5-19. doi: 10.1002/(sici)1097-0258(19970115)16:1<5::aid-sim469>3.0.co;2-8.
2. Multiple imputation using an iterative hot-deck with distance-based donor selection.
   Stat Med. 2008 Jan 15;27(1):83-102. doi: 10.1002/sim.3001.
3. Missing data methods for dealing with missing items in quality of life questionnaires. A comparison by simulation of personal mean score, full information maximum likelihood, multiple imputation, and hot deck techniques applied to the SF-36 in the French 2003 decennial health survey.
   Qual Life Res. 2011 Mar;20(2):287-300. doi: 10.1007/s11136-010-9740-3. Epub 2010 Oct 1.
4. Implementing Multiple Imputation for Missing Data in Longitudinal Studies When Models are Not Feasible: An Example Using the Random Hot Deck Approach.
   Clin Epidemiol. 2022 Nov 15;14:1387-1403. doi: 10.2147/CLEP.S368303. eCollection 2022.
5. On the multiple imputation variance estimator for control-based and delta-adjusted pattern mixture models.
   Biometrics. 2017 Dec;73(4):1379-1387. doi: 10.1111/biom.12702. Epub 2017 Apr 13.
6. Imputation of missing variance data using non-linear mixed effects modelling to enable an inverse variance weighted meta-analysis of summary-level longitudinal data: a case study.
   Pharm Stat. 2012 Jul-Aug;11(4):318-24. doi: 10.1002/pst.1515. Epub 2012 May 7.
7. A nonparametric multiple imputation approach for missing categorical data.
   BMC Med Res Methodol. 2017 Jun 6;17(1):87. doi: 10.1186/s12874-017-0360-2.
8. Multiple imputation techniques in small sample clinical trials.
   Stat Med. 2006 Jan 30;25(2):233-45. doi: 10.1002/sim.2231.
9. Asymptotic theory and inference of predictive mean matching imputation using a superpopulation model framework.
   Scand Stat Theory Appl. 2020 Sep;47(3):839-861. doi: 10.1111/sjos.12429. Epub 2019 Nov 8.
10. Multinomial logistic regression with missing outcome data: An application to cancer subtypes.
    Stat Med. 2020 Oct 30;39(24):3299-3312. doi: 10.1002/sim.8666. Epub 2020 Jul 6.

Cited by

1. Effects of Using a Smart Bassinet on the Mental Health of Military-Affiliated Pregnant Women: Protocol for a Randomized Controlled Sleep Health and Mood in Newly Expectant Military Mothers (SHINE) Trial.
   JMIR Res Protoc. 2025 Apr 10;14:e66439. doi: 10.2196/66439.
2. Health care utilization and receipt of preventive care for patients seen at federally funded health centers compared to other sites of primary care.
   Health Serv Res. 2014 Oct;49(5):1498-518. doi: 10.1111/1475-6773.12178. Epub 2014 Apr 30.
3. Transforming growth factor beta-1 and incidence of heart failure in older adults: the Cardiovascular Health Study.
   Cytokine. 2012 Nov;60(2):341-5. doi: 10.1016/j.cyto.2012.07.013. Epub 2012 Aug 9.
4. Partial linear inference for a 2-stage outcome-dependent sampling design with a continuous outcome.
   Biostatistics. 2011 Jul;12(3):506-20. doi: 10.1093/biostatistics/kxq070. Epub 2010 Dec 14.
5. Handling missing data by deleting completely observed records.
   J Stat Plan Inference. 2009 Jul 1;139(7):2341-2350. doi: 10.1016/j.jspi.2008.10.024.
6. Lifetime total physical activity and prostate cancer risk: a population-based case-control study in Sweden.
   Eur J Epidemiol. 2008;23(11):739-46. doi: 10.1007/s10654-008-9294-7. Epub 2008 Oct 18.