Multiple imputation for missing laboratory data: an example from infectious disease epidemiology.

Author information

Department of Obstetrics and Gynecology, Paul L. Foster School of Medicine, Texas Tech University Health Sciences Center, 4800 Alberta Ave., El Paso, TX 79905, USA.

Publication information

Ann Epidemiol. 2009 Dec;19(12):908-14. doi: 10.1016/j.annepidem.2009.08.002. Epub 2009 Oct 6.

Abstract

PURPOSE

To present multiple imputation (MI) as an appropriate method to address missing values for a laboratory parameter (serum albumin) in an epidemiologic study.

METHODS

A data set of patients who were hospitalized for invasive group A streptococcal infections was accessed. Age was the exposure of interest. The outcome was hospital mortality. Several variables, including serum albumin, were considered to be potential confounders. Of the 201 records, 91 had missing values for serum albumin. The MI procedure in SAS was used to perform 20 imputations of serum albumin by using a Markov chain Monte Carlo approach. Logistic regression was then performed on each of the 20 filled-in data sets, and the results were appropriately combined by using the MIANALYZE procedure.
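The combining step described above is usually done with Rubin's rules, which is what the MIANALYZE procedure applies to the per-imputation estimates. The following is a minimal Python sketch of that pooling step only, not the authors' SAS code. The function names `pool_rubin` and `odds_ratio_interval` and all numeric inputs are hypothetical, and the confidence interval uses a normal approximation rather than the t-based degrees of freedom that MIANALYZE computes.

```python
import math
from statistics import mean, variance


def pool_rubin(estimates, variances):
    """Combine per-imputation estimates with Rubin's rules.

    estimates: one log odds ratio per imputed data set
    variances: the squared standard error of each estimate
    Returns the pooled estimate and its total variance.
    """
    m = len(estimates)
    if m < 2 or m != len(variances):
        raise ValueError("need matching lists from at least two imputations")
    pooled = mean(estimates)       # average of the point estimates
    within = mean(variances)       # average within-imputation variance
    between = variance(estimates)  # sample variance across imputations
    total = within + (1 + 1 / m) * between
    return pooled, total


def odds_ratio_interval(log_or, total_variance, z=1.96):
    """Back-transform a pooled log odds ratio to an OR with an approximate 95% CI.

    Assumption: normal approximation instead of the t reference distribution.
    """
    se = math.sqrt(total_variance)
    return (math.exp(log_or),
            math.exp(log_or - z * se),
            math.exp(log_or + z * se))


# Hypothetical log odds ratios and standard errors from m = 5 imputed
# data sets (the study used 20); these are NOT values from the paper.
log_ors = [1.05, 1.20, 1.10, 1.15, 1.00]
std_errors = [0.46, 0.45, 0.47, 0.46, 0.45]

pooled, total = pool_rubin(log_ors, [s ** 2 for s in std_errors])
print(odds_ratio_interval(pooled, total))
```

The total variance adds the between-imputation spread to the average within-imputation variance, so the pooled interval reflects the uncertainty introduced by imputing the missing albumin values rather than treating them as observed.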

RESULTS

Age (≥ 55 years vs. 0-54 years) was not a risk factor for hospital mortality in the complete-case analysis (n = 110): adjusted odds ratio (OR) = 2.43 (95% confidence interval [CI]: 0.79-7.53). Age was a significant risk factor in the imputed data set (n = 201): adjusted OR = 3.08 (95% CI: 1.22-7.78).
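To see how the two analyses differ in precision, the reported intervals can be converted back to the log-odds scale. The sketch below is a reader-side check, not part of the study's analysis. It assumes the intervals are Wald intervals (symmetric on the log scale with a 1.96 multiplier), which the abstract does not state, and the helper names `se_from_ci` and `wald_test` are hypothetical.

```python
import math


def se_from_ci(lower, upper, z=1.96):
    """Approximate standard error of log(OR) from a 95% Wald interval."""
    return (math.log(upper) - math.log(lower)) / (2 * z)


def wald_test(odds_ratio, lower, upper):
    """Return (standard error, z statistic, two-sided p-value) for log(OR) = 0."""
    se = se_from_ci(lower, upper)
    z_stat = math.log(odds_ratio) / se
    # Two-sided normal tail probability via the complementary error function.
    p_value = math.erfc(abs(z_stat) / math.sqrt(2))
    return se, z_stat, p_value


# Odds ratios and intervals as reported in the abstract.
complete_case = wald_test(2.43, 0.79, 7.53)  # n = 110
imputed = wald_test(3.08, 1.22, 7.78)        # n = 201

for label, (se, z_stat, p_value) in [("complete-case", complete_case),
                                     ("multiple imputation", imputed)]:
    print(f"{label}: SE={se:.3f}, z={z_stat:.2f}, p={p_value:.3f}")
```

Under that assumption, the standard error of the log odds ratio falls from roughly 0.58 in the complete-case analysis to roughly 0.47 after imputation, which is why the second interval excludes 1 while the first does not.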

CONCLUSIONS

Epidemiologists frequently encounter data sets that contain missing values. Traditional missing data techniques such as complete-case analysis may lead to biased results. We have demonstrated the use of a novel technique, MI, to account for missing data.
