Louzada Francisco, Shimizu Taciana Ko, Suzuki Adriano K
Department of Applied Mathematics & Statistics, ICMC, University of São Paulo, São Carlos, SP, Brazil.
Stat Methods Med Res. 2020 May;29(5):1434-1446. doi: 10.1177/0962280219863817. Epub 2019 Jul 23.
Analyzing large-scale compositional data poses considerable challenges. In this paper, we introduce Spike-and-Slab Lasso linear regression with compositional covariates for parameter estimation and variable selection. We use the well-known isometric log-ratio (ilr) coordinates to avoid misleading statistical inference. The separable and non-separable (adaptive) Spike-and-Slab Lasso penalties are compared to assess the advantages of each approach. The proposed method is illustrated on simulated data and on real Brazilian child malnutrition data.
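As a rough illustration of the ilr step mentioned in the abstract, the following sketch maps strictly positive compositions with D parts to D-1 real coordinates. It assumes pivot coordinates, which are one common choice of orthonormal ilr basis; the abstract does not say which basis the authors use, and the function name `ilr_pivot` is hypothetical.

```python
import numpy as np


def ilr_pivot(X):
    """Map compositions (rows of X, D strictly positive parts) to D-1 ilr
    coordinates using pivot coordinates. Assumption: this basis is chosen
    for illustration only and may differ from the one used in the paper."""
    X = np.asarray(X, dtype=float)
    if X.ndim != 2 or X.shape[1] < 2:
        raise ValueError("X must be a 2-D array with at least two parts per row")
    if np.any(X <= 0):
        raise ValueError("all parts must be strictly positive")

    log_x = np.log(X)
    n, d = X.shape
    Z = np.empty((n, d - 1))
    for j in range(d - 1):
        r = d - j - 1  # number of parts that follow part j
        # Scaled log-ratio of part j to the geometric mean of the later parts.
        Z[:, j] = np.sqrt(r / (r + 1.0)) * (log_x[:, j] - log_x[:, j + 1:].mean(axis=1))
    return Z


if __name__ == "__main__":
    compositions = np.array([[0.2, 0.3, 0.5],
                             [0.6, 0.1, 0.3]])
    print(ilr_pivot(compositions))
```

The resulting coordinates are unconstrained real values that do not depend on the scale of each row, so they can be passed as ordinary covariates to a penalized linear regression routine.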