

Multiple Imputation of Missing Race and Ethnicity in CDC COVID-19 Case-Level Surveillance Data.

Authors

Zhang Guangyu, Rose Charles E, Zhang Yujia, Li Rui, Lee Florence C, Massetti Greta, Adams Laura E

Affiliations

CDC COVID-19 Response Team, Centers for Disease Control and Prevention, Atlanta, Georgia.

Health Resources and Services Administration, Rockville, Maryland, USA.

Publication information

Int J Stat Med Res. 2022 Jan 28;11:1-11. doi: 10.6000/1929-6029.2022.11.01.

Abstract

The COVID-19 pandemic has resulted in a disproportionate burden on racial and ethnic minority groups, but incompleteness in surveillance data limits understanding of disparities. CDC's case-based surveillance system contains case-level information on most COVID-19 cases in the United States. Data analyzed in this paper contain COVID-19 cases with case-level information through September 25, 2020, which represent 70.9% of all COVID-19 cases reported to CDC during the period. Case-level surveillance data are used to investigate COVID-19 disparities by race/ethnicity, sex, and age. However, demographic information on race and ethnicity is missing for a substantial percentage of COVID-19 cases (e.g., 35.8% and 47.2% of cases analyzed were missing race and ethnicity information, respectively). Our goal in this study was to impute missing race and ethnicity to derive more accurate incidence and incidence rate ratio (IRR) estimates for different racial and ethnic groups, and evaluate the results from imputation compared to complete case analysis, which involves removing cases with missing race/ethnicity information from the analysis. Two multiple imputation (MI) models were developed. Model 1 imputes race using six binary race variables, and Model 2 imputes race as a composite multinomial variable. Our evaluation found that compared with complete case analysis, MI reduced biases and improved coverage on incidence and IRR estimates for all race/ethnicity groups, except for the Non-Hispanic Multiple/other group. Our research highlights the importance of supplementing complete case analysis with additional methods of analysis to better describe racial and ethnic disparities. When race and ethnicity data are missing, multiple imputation may provide more accurate incidence and IRR estimates to monitor these disparities in tandem with efforts to improve the collection of race and ethnicity information for pandemic surveillance.
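To make the estimands concrete, the following is a minimal sketch of how an incidence rate ratio can be computed in each imputed dataset and then combined across imputations. The abstract does not say how the authors pooled their results; Rubin's rules are assumed here as the conventional choice for multiple imputation. The case counts, population sizes, and function names are hypothetical and chosen only for illustration.

```python
import math


def incidence_rate_ratio(cases_a, pop_a, cases_b, pop_b):
    """IRR of group A relative to reference group B."""
    return (cases_a / pop_a) / (cases_b / pop_b)


def pool_rubin(estimates, variances):
    """Combine per-imputation estimates with Rubin's rules.

    Returns (pooled estimate, total variance), where the total variance is
    the mean within-imputation variance plus (1 + 1/m) times the
    between-imputation variance.
    """
    m = len(estimates)
    q_bar = sum(estimates) / m
    within = sum(variances) / m
    between = sum((q - q_bar) ** 2 for q in estimates) / (m - 1)
    total = within + (1 + 1 / m) * between
    return q_bar, total


# Hypothetical case counts (group A, reference group B) from m = 3 imputed
# datasets; these are NOT values from the paper.
imputed_counts = [(120, 100), (130, 95), (125, 98)]
pop_a, pop_b = 10_000, 20_000  # hypothetical population denominators

log_irrs, variances = [], []
for cases_a, cases_b in imputed_counts:
    irr = incidence_rate_ratio(cases_a, pop_a, cases_b, pop_b)
    log_irrs.append(math.log(irr))
    # Large-sample variance of log(IRR) when both counts are Poisson.
    variances.append(1 / cases_a + 1 / cases_b)

# Pool on the log scale, then back-transform.
pooled_log_irr, total_var = pool_rubin(log_irrs, variances)
pooled_irr = math.exp(pooled_log_irr)
half_width = 1.96 * math.sqrt(total_var)  # normal approximation
ci = (math.exp(pooled_log_irr - half_width), math.exp(pooled_log_irr + half_width))
print(f"pooled IRR = {pooled_irr:.2f}, approx. 95% CI = ({ci[0]:.2f}, {ci[1]:.2f})")
```

Pooling is done on the log scale because the sampling distribution of log(IRR) is closer to normal than that of the IRR itself. A complete case analysis would instead compute a single IRR from only the records with observed race and ethnicity, which is the comparison the abstract describes.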


