Grigsby Matthew R, Di Junrui, Leroux Andrew, Zipunnikov Vadim, Xiao Luo, Crainiceanu Ciprian, Checkley William
1Division of Pulmonary and Critical Care, School of Medicine, Johns Hopkins University, 1830 E. Monument Street, 5th Floor, Baltimore, MD 21287 USA.
2Department of Biostatistics, Johns Hopkins Bloomberg School of Public Health, Baltimore, MD USA.
Emerg Themes Epidemiol. 2018 Feb 23;15:4. doi: 10.1186/s12982-018-0072-z. eCollection 2018.
The literature on statistical modeling of childhood growth data offers a diverse set of models from which investigators can choose. However, the lack of a comprehensive framework for comparing non-nested models makes it difficult to assess model performance. This paper proposes a framework for comparing non-nested growth models using novel metrics of predictive accuracy based on modifications of the mean squared error criterion.
Three metrics were created: normalized, age-adjusted, and weighted mean squared error (MSE). These predictive performance metrics were used to compare linear mixed effects models with functional regression models. Prediction accuracy was assessed by partitioning the observed data into training and test datasets. The partitions were constructed to assess prediction accuracy in four scenarios: backward (i.e., early growth), forward (i.e., late growth), in-range, and new individuals. Analyses used height measurements from 215 Peruvian children, with data spanning from near birth to 2 years of age.
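The four train/test scenarios can be illustrated with a minimal Python sketch. It is not the authors' code: the function name `split_records`, the age cutoffs of 6 and 18 months, the choice to hold out the middle age window for the in-range scenario, and the `holdout_ids` argument are all assumptions made for illustration, since the abstract does not state how the partitions were defined.

```python
from typing import FrozenSet, List, Tuple

# One height measurement: (child_id, age_in_months, height_in_cm).
Record = Tuple[str, float, float]


def split_records(
    records: List[Record],
    scenario: str,
    early_cutoff: float = 6.0,    # hypothetical cutoff, not from the paper
    late_cutoff: float = 18.0,    # hypothetical cutoff, not from the paper
    holdout_ids: FrozenSet[str] = frozenset(),
) -> Tuple[List[Record], List[Record]]:
    """Return (train, test) lists for one prediction scenario."""
    train: List[Record] = []
    test: List[Record] = []
    for rec in records:
        child_id, age, _height = rec
        if scenario == "backward":
            # Predict early growth from later observations.
            held_out = age < early_cutoff
        elif scenario == "forward":
            # Predict late growth from earlier observations.
            held_out = age > late_cutoff
        elif scenario == "in_range":
            # Assumed here: predict a middle window from both ends.
            held_out = early_cutoff <= age <= late_cutoff
        elif scenario == "new_individuals":
            # Hold out every measurement of the selected children.
            held_out = child_id in holdout_ids
        else:
            raise ValueError(f"unknown scenario: {scenario}")
        (test if held_out else train).append(rec)
    return train, test
```

For example, `split_records(data, "new_individuals", holdout_ids=frozenset({"c07"}))` would place all measurements from the hypothetical child `c07` in the test set, so that child contributes nothing to model fitting.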
Functional models outperformed linear mixed effects models in all scenarios tested. In particular, prediction errors for functional concurrent regression (FCR) and functional principal component analysis models were approximately 6% lower when compared to linear mixed effects models. When we weighted subject-specific MSEs according to subject-specific growth rates during infancy, we found that FCR was the best performer in all scenarios.
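The idea of weighting subject-specific MSEs can be sketched as a weighted average, shown below in Python. This is an illustrative form only: the abstract does not give the exact definition of the weighted MSE or how weights were derived from growth rates, so the helper names `subject_mse` and `weighted_mse` and the normalization by the sum of weights are assumptions.

```python
from typing import Dict, Sequence


def subject_mse(observed: Sequence[float], predicted: Sequence[float]) -> float:
    """Mean squared prediction error for one child's test measurements."""
    if len(observed) != len(predicted) or not observed:
        raise ValueError("need equal-length, non-empty sequences")
    return sum((o - p) ** 2 for o, p in zip(observed, predicted)) / len(observed)


def weighted_mse(mses: Dict[str, float], weights: Dict[str, float]) -> float:
    """Weighted average of per-child MSEs (assumed form, not the paper's).

    `weights` maps each child id to a non-negative weight, for instance
    one derived from that child's growth rate during infancy.
    """
    total = sum(weights[child] for child in mses)
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    return sum(weights[child] * mses[child] for child in mses) / total
```

With equal weights this reduces to the plain average of the subject-specific MSEs; larger weights for a subgroup make a model's errors on that subgroup count more when models are ranked.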
With this novel approach, we can quantitatively compare non-nested models and weight subgroups of interest to select the best performing growth model for a particular application or problem at hand.