Leroux Andrew, Di Junrui, Smirnova Ekaterina, McGuffey Elizabeth J, Cao Quy, Bayatmokhtari Elham, Tabacu Lucia, Zipunnikov Vadim, Urbanek Jacek K, Crainiceanu Ciprian
Department of Biostatistics, Bloomberg School of Public Health, Johns Hopkins University, Baltimore, USA.
Department of Biostatistics, Virginia Commonwealth University, Richmond, USA.
Stat Biosci. 2019 Jul;11(2):262-287. doi: 10.1007/s12561-018-09229-9. Epub 2019 Feb 9.
The National Health and Nutrition Examination Survey (NHANES) contains objectively measured physical activity data collected using hip-worn accelerometers from multiple cohorts. However, using the accelerometry data has proven daunting because: 1) there are currently no agreed-upon standard protocols for data storage and analysis; 2) the data exhibit heterogeneous patterns of missingness due to varying degrees of adherence to wear-time protocols; 3) sampling weights need to be carefully adjusted and accounted for in individual analyses; 4) there is a lack of reproducible software that transforms the data from their published format into analytic form; and 5) the high-dimensional nature of accelerometry data complicates analyses. Here, we provide a framework for processing, storing, and analyzing the NHANES accelerometry data for the 2003-2004 and 2005-2006 surveys. We also provide an NHANES data package in R to help disseminate high-quality, processed activity data combined with mortality and demographic information. Thus, we provide the tools to transition from "available data online" to "easily accessible and usable data", which substantially reduces the large upfront costs of initiating studies of association between physical activity and human health outcomes using NHANES. We apply these tools in an analysis showing that accelerometry features have the potential to predict 5-year all-cause mortality better than known risk factors such as age, cigarette smoking, and various comorbidities.