Guo Jiaqi, Gelfand Saul B, Hennessy Erin, Aqeel Marah M, Eicher-Miller Heather A, Richards Elizabeth A, Lin Luotao, Bhadra Anindya, Delp Edward J
School of Electrical and Computer Engineering, Purdue University, West Lafayette, IN, USA.
Friedman School of Nutrition Science and Policy, Tufts University, Boston, MA, USA.
medRxiv. 2023 Jan 26:2023.01.23.23284777. doi: 10.1101/2023.01.23.23284777.
Physical activity (PA) is known to influence the risk of obesity and chronic diseases such as diabetes and metabolic syndrome. Few attempts have been made to characterize the timing of physical activity together with its intensity and duration in order to determine how this multi-faceted behavior relates to health. In this paper, we explore a distance-based approach for clustering daily physical activity time series to estimate temporal physical activity patterns among U.S. adults (ages 20-65) from the National Health and Nutrition Examination Survey 2003-2006 (NHANES). Several distance measures and distance-based clustering methods were investigated and compared using two kinds of criteria: internal criteria (the Silhouette index and the Dunn index) and external criteria (the associations of the clusters with health status indicators). Our experiments indicate that a distance-based cluster analysis of temporal physical activity patterns through the day has the potential to describe the complexity of the behavior, rather than characterizing physical activity solely by sums or by labels of maximum activity levels.
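The general shape of such an analysis can be illustrated with a minimal sketch: compute a pairwise distance matrix over daily activity time series, cluster from that matrix, and score the result with the Silhouette and Dunn indices. Everything specific here is an assumption made for illustration, not the paper's method: the data are synthetic hourly activity counts rather than NHANES accelerometer records, Euclidean distance stands in for whichever distance measures were evaluated, and a basic k-medoids loop stands in for the clustering methods compared. All function names are hypothetical.

```python
import numpy as np

def pairwise_euclidean(X):
    """Return the (n, n) matrix of Euclidean distances between rows of X."""
    diff = X[:, None, :] - X[None, :, :]
    return np.sqrt((diff ** 2).sum(axis=-1))

def k_medoids(D, k, n_iter=100):
    """Cluster from a precomputed distance matrix D with a basic k-medoids loop."""
    # Farthest-first initialisation: start from the overall medoid, then
    # repeatedly add the point farthest from the medoids chosen so far.
    chosen = [int(np.argmin(D.sum(axis=1)))]
    while len(chosen) < k:
        chosen.append(int(np.argmax(D[:, chosen].min(axis=1))))
    medoids = np.array(chosen)
    for _ in range(n_iter):
        labels = np.argmin(D[:, medoids], axis=1)
        new_medoids = medoids.copy()
        for c in range(k):
            members = np.flatnonzero(labels == c)
            if members.size:
                # New medoid: the member with the smallest total distance
                # to the other members of its cluster.
                within = D[np.ix_(members, members)].sum(axis=1)
                new_medoids[c] = members[np.argmin(within)]
        if np.array_equal(new_medoids, medoids):
            break
        medoids = new_medoids
    return np.argmin(D[:, medoids], axis=1), medoids

def silhouette(D, labels):
    """Mean silhouette width computed directly from the distance matrix."""
    clusters = np.unique(labels)
    scores = np.zeros(D.shape[0])
    for i in range(D.shape[0]):
        same = labels == labels[i]
        if same.sum() == 1:
            continue  # singleton cluster: silhouette is defined as 0
        a = D[i, same].sum() / (same.sum() - 1)  # mean distance within own cluster
        b = min(D[i, labels == c].mean() for c in clusters if c != labels[i])
        scores[i] = (b - a) / max(a, b)
    return scores.mean()

def dunn_index(D, labels):
    """Smallest between-cluster distance divided by the largest cluster diameter."""
    clusters = np.unique(labels)
    max_diam = max(D[np.ix_(labels == c, labels == c)].max() for c in clusters)
    min_sep = min(
        D[np.ix_(labels == a, labels == b)].min()
        for i, a in enumerate(clusters) for b in clusters[i + 1:]
    )
    return min_sep / max_diam

# Synthetic stand-in data (an assumption): 60 "days" of 24 hourly activity
# values, half with a morning peak and half with an evening peak.
rng = np.random.default_rng(42)
hours = np.arange(24)

def bump(center):
    return 100 * np.exp(-0.5 * ((hours - center) / 2.0) ** 2)

X = np.vstack([
    bump(8) + rng.normal(0, 5, (30, 24)),
    bump(18) + rng.normal(0, 5, (30, 24)),
])

D = pairwise_euclidean(X)
labels, medoids = k_medoids(D, k=2)
sil = silhouette(D, labels)
dunn = dunn_index(D, labels)
print(f"silhouette = {sil:.3f}, Dunn index = {dunn:.3f}")
```

Because every step after `pairwise_euclidean` works only from the distance matrix `D`, a different distance measure can be swapped in without touching the clustering or the internal criteria. External criteria, such as associations between cluster membership and health status indicators, would need participant-level health data and are not sketched here.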