Méndez-Civieta Álvaro, Wei Ying, Diaz Keith M, Goldsmith Jeff
Department of Biostatistics, Columbia University, 722 W 178 St, New York, NY 10032, United States.
uc3m-Santander Big Data Institute, University Carlos III of Madrid, C. Madrid, 126, Madrid 28903, Spain.
Biostatistics. 2024 Dec 31;26(1). doi: 10.1093/biostatistics/kxae040.
This paper introduces functional quantile principal component analysis (FQPCA), a dimensionality reduction technique that extends the concept of functional principal component analysis (FPCA) to the examination of participant-specific quantile curves. Our approach borrows strength across participants to estimate patterns in quantiles, and uses participant-level data to estimate loadings on those patterns. As a result, FQPCA is able to capture shifts in the scale and distribution of data that affect participant-level quantile curves, and is also a robust methodology suitable for dealing with outliers, heteroscedastic data, or skewed data. The need for such methodology is exemplified by physical activity data collected using wearable devices. Participants often differ in the timing and intensity of physical activity behaviors, and capturing information beyond the participant-level expected value curves produced by FPCA is necessary for a robust quantification of diurnal patterns of activity. We illustrate our methods using accelerometer data from the National Health and Nutrition Examination Survey, and produce participant-level 10%, 50%, and 90% quantile curves over 24 h of activity. The proposed methodology is supported by simulation results, and is available as an R package.
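The abstract does not spell out the estimation objective, so the following is only a minimal sketch of the general idea, under stated assumptions: that each participant's quantile curve is represented as a shared intercept plus participant-specific loadings on a few shared component curves, and that fit is measured with the standard quantile (check) loss. All function names (`pinball_loss`, `fitted_curves`, `objective`, `best_constant`) are hypothetical and are not the API of the authors' R package.

```python
import numpy as np


def pinball_loss(residuals, tau):
    """Check loss: tau * r when r >= 0, (tau - 1) * r when r < 0."""
    r = np.asarray(residuals, dtype=float)
    return np.where(r >= 0, tau * r, (tau - 1.0) * r)


def fitted_curves(intercept, loadings, components):
    """Row i is intercept(t) + sum_k loadings[i, k] * components[k, t].

    Assumed low-rank form: components are shared across participants,
    loadings are participant-specific.
    """
    return intercept[None, :] + loadings @ components


def objective(y, intercept, loadings, components, tau):
    """Mean check loss over the observed (non-NaN) entries of y."""
    observed = ~np.isnan(y)
    fitted = fitted_curves(intercept, loadings, components)
    return pinball_loss(y[observed] - fitted[observed], tau).mean()


def best_constant(sample, tau, grid):
    """Grid value that minimizes the mean check loss against a sample."""
    losses = [pinball_loss(sample - c, tau).mean() for c in grid]
    return grid[int(np.argmin(losses))]


# Minimizing the check loss over a constant recovers a sample quantile.
sample = np.arange(1.0, 12.0)          # 1, 2, ..., 11
grid = np.linspace(1.0, 11.0, 101)     # candidate constants
print(best_constant(sample, 0.5, grid))
```

With `tau = 0.5` the selected constant is the sample median (up to the grid resolution), which is why minimizing the same loss at `tau = 0.1` or `tau = 0.9` targets the lower and upper quantile curves rather than the mean curve targeted by FPCA. An actual fit would alternate or jointly optimize the intercept, components, and loadings; that step is omitted here.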