School of Chinese Materia Medica, Beijing University of Chinese Medicine, Beijing, China.
Academy of Mathematics and Systems Science, Chinese Academy of Sciences, Beijing, China.
Biometrics. 2023 Mar;79(1):178-189. doi: 10.1111/biom.13574. Epub 2021 Oct 28.
In this paper, we propose a frequentist model averaging method for quantile regression with high-dimensional covariates. Although model averaging and high-dimensional quantile regression have each been studied extensively as separate approaches, no study has considered them in conjunction. The first step of our method reduces the covariate dimension by ranking the covariates according to their marginal quantile utilities. The second step implements model averaging over the models containing the covariates that survive the screening of the first step. We use a delete-one cross-validation method to select the model weights, and prove that the resulting estimator possesses an asymptotic optimality property uniformly over any compact subset of (0,1) of quantile indices. Our proof, which relies on empirical process theory, is arguably more challenging than proofs of similar results in other contexts, owing to the high-dimensional nature of the problem and our relaxation of the conventional assumption that the weights sum to one. Our investigation of finite-sample performance demonstrates that the proposed method compares very favorably with the least absolute shrinkage and selection operator (LASSO) and smoothly clipped absolute deviation (SCAD) penalized regression methods. The method is applied to a microarray gene expression data set.
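The two-step procedure described above (screening by marginal quantile utility, then model averaging with delete-one cross-validated weights) can be illustrated with a minimal Python sketch. It is not the authors' implementation, and several choices are assumptions made only for illustration: the marginal utility is taken to be the check-loss reduction of a univariate linear quantile fit, the candidate models are nested sets of the top-ranked covariates, each quantile regression is solved as a linear program with `scipy.optimize.linprog`, and the data are simulated. All function names (`check_loss`, `fit_quantile`, `screen`, `jackknife_weights`) are hypothetical. In line with the abstract, the weights are restricted to [0, 1] but are not forced to sum to one.

```python
import numpy as np
from scipy.optimize import linprog


def check_loss(residuals, tau):
    """Quantile check loss: sum of r * (tau - 1{r < 0})."""
    return float(np.sum(residuals * (tau - (residuals < 0))))


def fit_quantile(X, y, tau):
    """Linear quantile regression in its standard LP form (intercept added here)."""
    n = X.shape[0]
    Z = np.column_stack([np.ones(n), X])
    p = Z.shape[1]
    # Variables: beta (free), u_plus >= 0, u_minus >= 0,
    # subject to Z @ beta + u_plus - u_minus = y.
    cost = np.concatenate([np.zeros(p), tau * np.ones(n), (1 - tau) * np.ones(n)])
    A_eq = np.hstack([Z, np.eye(n), -np.eye(n)])
    bounds = [(None, None)] * p + [(0, None)] * (2 * n)
    res = linprog(cost, A_eq=A_eq, b_eq=y, bounds=bounds, method="highs")
    return res.x[:p]


def predict(beta, X):
    """Fitted conditional quantile for rows of X (or a single 1-D row)."""
    return beta[0] + X @ beta[1:]


def screen(X, y, tau, keep):
    """Step 1 (assumed utility): rank covariates by how much a univariate
    quantile fit lowers the check loss relative to an intercept-only fit."""
    base = check_loss(y - np.quantile(y, tau), tau)
    utilities = np.array([
        base - check_loss(y - predict(fit_quantile(X[:, [j]], y, tau), X[:, [j]]), tau)
        for j in range(X.shape[1])
    ])
    return np.argsort(-utilities)[:keep]


def jackknife_weights(X, y, tau, models):
    """Step 2: delete-one CV weights in [0, 1]^M, with no sum-to-one constraint."""
    n, M = len(y), len(models)
    loo = np.empty((n, M))  # loo[i, m]: prediction for i from model m fitted without i
    for i in range(n):
        mask = np.arange(n) != i
        for m, cols in enumerate(models):
            beta = fit_quantile(X[mask][:, cols], y[mask], tau)
            loo[i, m] = predict(beta, X[i, cols])
    # Minimise the check loss of y - loo @ w over w, again as an LP.
    cost = np.concatenate([np.zeros(M), tau * np.ones(n), (1 - tau) * np.ones(n)])
    A_eq = np.hstack([loo, np.eye(n), -np.eye(n)])
    bounds = [(0, 1)] * M + [(0, None)] * (2 * n)
    res = linprog(cost, A_eq=A_eq, b_eq=y, bounds=bounds, method="highs")
    return res.x[:M]


# Simulated data (an assumption for illustration): p > n, three active covariates.
rng = np.random.default_rng(0)
n, p, tau = 60, 200, 0.5
X = rng.standard_normal((n, p))
y = 2.0 * X[:, 0] - 1.5 * X[:, 1] + X[:, 2] + 0.5 * rng.standard_normal(n)

kept = screen(X, y, tau, keep=6)
models = [kept[:k] for k in (2, 4, 6)]  # nested candidate models (assumed design)
weights = jackknife_weights(X, y, tau, models)
fits = np.column_stack([predict(fit_quantile(X[:, c], y, tau), X[:, c]) for c in models])
averaged = fits @ weights  # model-averaged conditional-quantile fit
```

Casting both the quantile fits and the weight selection as linear programs keeps the sketch exact for the check loss, at the cost of refitting every candidate model n times. This is only practical because the screening step has already reduced the number of covariates.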