Department of Epidemiology and Biostatistics, Milken Institute School of Public Health, The George Washington University, Washington, DC.
Department of Mathematics and Statistics, The University of Arkansas at Little Rock, Little Rock, AR.
Health Serv Res. 2018 Jun;53(3):1870-1889. doi: 10.1111/1475-6773.12704. Epub 2017 May 4.
OBJECTIVE: To identify the most appropriate imputation method for missing data in the Healthcare Cost and Utilization Project (HCUP) State Inpatient Databases (SID) and to assess the impact of different missing data methods on racial disparities research.
DATA SOURCES/STUDY SETTING: HCUP SID.
STUDY DESIGN: A novel simulation study compared four imputation methods (random draw, hot deck, joint multiple imputation [MI], and conditional MI) for missing values in multiple variables, including race, gender, admission source, median household income, and total charges. The simulation was built on real data from the SID to retain their hierarchical data structures and missing data patterns. Additional predictive information from the U.S. Census and the American Hospital Association (AHA) database was incorporated into the imputation.
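To make the contrast between the two simplest methods concrete, the following is a minimal sketch of random draw and hot deck imputation for a single categorical variable. It is an illustration under stated assumptions, not the authors' implementation: the function names, the `hospital` and `race` field names, the choice of hospital as the hot deck cell, and the toy records are all hypothetical. Joint and conditional MI are not shown.

```python
import random


def random_draw_impute(records, field, rng):
    """Fill missing values by sampling from all observed values of `field`."""
    observed = [r[field] for r in records if r[field] is not None]
    return [
        {**r, field: rng.choice(observed)} if r[field] is None else dict(r)
        for r in records
    ]


def hot_deck_impute(records, field, cell_key, rng):
    """Fill missing values by sampling a donor from the same cell."""
    # Group observed values by cell (here, a hypothetical hospital ID).
    donors = {}
    for r in records:
        if r[field] is not None:
            donors.setdefault(r[cell_key], []).append(r[field])
    all_observed = [v for values in donors.values() for v in values]

    imputed = []
    for r in records:
        if r[field] is None:
            # Fall back to all observed values if the cell has no donor.
            pool = donors.get(r[cell_key], all_observed)
            imputed.append({**r, field: rng.choice(pool)})
        else:
            imputed.append(dict(r))
    return imputed


# Hypothetical toy records; None marks a missing value.
records = [
    {"hospital": "A", "race": "White"},
    {"hospital": "A", "race": "White"},
    {"hospital": "A", "race": None},
    {"hospital": "B", "race": "Black"},
    {"hospital": "B", "race": None},
]

rng = random.Random(0)
hot = hot_deck_impute(records, "race", "hospital", rng)
rand = random_draw_impute(records, "race", rng)
```

In this toy example, hot deck imputation can only assign a value observed within the same hospital, whereas random draw ignores the hospital grouping and samples from the pooled observed values.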
PRINCIPAL FINDINGS: Conditional MI prediction was equivalent or superior to the best-performing alternatives for all missing data structures and substantially outperformed each of the alternatives in various scenarios.
CONCLUSIONS: Conditional MI substantially improved statistical inferences for racial health disparities research with the SID.