Division of Epidemiology and Biostatistics, University of Illinois at Chicago, Chicago, IL 60612, USA.
Stat Med. 2012 Nov 30;31(27):3337-46. doi: 10.1002/sim.5362. Epub 2012 Apr 25.
Studies that collect multiple outcomes and predictors of different distributional types are increasingly common in public health practice, and joint modeling of mixed data types has gained popularity in recent years. Evaluating, in simulated environments, the statistical techniques that have been developed for mixed data necessarily requires joint generation of multiple variables. Most large public health data sets include variables of different types. For instance, in clustered or longitudinal designs, multiple variables are often measured or observed for each individual or at each occasion. This work is motivated by the need to jointly generate binary and possibly non-normal continuous variables. We illustrate the use of power polynomials to simulate multivariate mixed data on the basis of a real adolescent smoking study. We believe that our proposed technique for simulating such intensive data has the potential to be a handy methodological addition to the public health researcher's toolkit.
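As a rough illustration of the idea, the sketch below uses a third-order (Fleishman-type) power polynomial, Y = a + bZ + cZ^2 + dZ^3 with a = -c, to turn a standard normal draw into a non-normal continuous variable, and thresholds a second, correlated standard normal draw to obtain a binary variable. This is a minimal sketch under stated assumptions, not the procedure described in the paper: the function names, the example coefficients (which satisfy the moment equations for skewness 2 and excess kurtosis 6 to about three decimal places), and the parameter values are all illustrative. In particular, `rho` here is the correlation between the underlying normal variables only; the correlation between the final binary and continuous variables will differ, and no adjustment for that is attempted.

```python
import math
import random
from statistics import NormalDist

# Illustrative coefficients (b, c, d): they satisfy the moment equations in
# fleishman_moments() for skewness 2 and excess kurtosis 6, up to rounding.
EXAMPLE_COEFFS = (0.826324, 0.313749, 0.022707)


def fleishman_moments(b, c, d):
    """Moments implied by Y = -c + b*Z + c*Z**2 + d*Z**3 for standard normal Z.

    Returns (variance, skewness, excess kurtosis). The skewness and kurtosis
    expressions assume the coefficients were chosen so that the variance is 1.
    """
    variance = b**2 + 6*b*d + 2*c**2 + 15*d**2
    skewness = 2*c*(b**2 + 24*b*d + 105*d**2 + 2)
    excess_kurtosis = 24*(b*d + c**2*(1 + b**2 + 28*b*d)
                          + d**2*(12 + 48*b*d + 141*c**2 + 225*d**2))
    return variance, skewness, excess_kurtosis


def power_polynomial(z, b, c, d):
    """Apply the cubic transformation; the intercept -c keeps the mean at 0."""
    return -c + b*z + c*z**2 + d*z**3


def simulate_mixed(n, p, rho, coeffs, seed=0):
    """Draw n (binary, continuous) pairs.

    p      : target P(binary == 1)
    rho    : correlation between the two underlying normals (an assumption of
             this sketch; it is NOT the correlation of the final variables)
    coeffs : (b, c, d) power-polynomial coefficients
    """
    rng = random.Random(seed)
    threshold = NormalDist().inv_cdf(1.0 - p)  # P(Z > threshold) = p
    b, c, d = coeffs
    rows = []
    for _ in range(n):
        z1 = rng.gauss(0.0, 1.0)
        # Second normal built to have correlation rho with z1.
        z2 = rho*z1 + math.sqrt(1.0 - rho**2)*rng.gauss(0.0, 1.0)
        binary = 1 if z1 > threshold else 0
        rows.append((binary, power_polynomial(z2, b, c, d)))
    return rows
```

For example, `simulate_mixed(1000, 0.3, 0.5, EXAMPLE_COEFFS)` returns 1000 pairs in which roughly 30% of the binary values are 1 and the continuous values are right-skewed with mean near 0. In practice the coefficients would be obtained by solving the three moment equations numerically for the desired skewness and kurtosis.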