Department of Statistics, North Carolina State University, Raleigh, North Carolina.
Biometrics. 2023 Mar;79(1):151-164. doi: 10.1111/biom.13576. Epub 2021 Nov 10.
Flexible estimation of multiple conditional quantiles is of interest in numerous applications, such as studying the effect of pregnancy-related factors on low and high birth weight. We propose a Bayesian nonparametric method to simultaneously estimate noncrossing, nonlinear quantile curves. We expand the conditional distribution function of the response in I-spline basis functions, where the covariate-dependent coefficients are modeled using neural networks. By leveraging the approximation power of splines and neural networks, our model can approximate any continuous quantile function. Compared to existing models, our model estimates all quantiles rather than a finite subset, scales well to high dimensions, and accounts for estimation uncertainty. Although the model is arbitrarily flexible, we estimate interpretable marginal quantile effects using accumulated local effects (ALE) plots and variable importance measures. A simulation study shows that our model can better recover quantiles of the response distribution when the data are sparse, and an analysis of birth weight data is presented.
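To illustrate why a monotone basis expansion with covariate-dependent coefficients yields noncrossing curves, here is a minimal sketch. It is not the authors' implementation. It makes several assumptions: a linear ramp basis (the lowest-order I-spline) stands in for the I-spline basis, a one-hidden-layer network with random weights stands in for the fitted Bayesian neural network, and the expansion is written as a function of the quantile level. All names (`ramp_basis`, `coefficient_network`, `quantile_curves`) are hypothetical. The point is only that nonnegative weights on nondecreasing basis functions give a curve that is nondecreasing in the quantile level for every covariate value.

```python
import numpy as np

def ramp_basis(tau, knots):
    """Monotone ramp basis (assumed stand-in for I-splines).

    Column j rises linearly from 0 to 1 on [knots[j], knots[j+1]]
    and is constant outside that interval.
    """
    tau = np.asarray(tau, dtype=float)[:, None]
    lo, hi = knots[:-1][None, :], knots[1:][None, :]
    return np.clip((tau - lo) / (hi - lo), 0.0, 1.0)

def softplus(z):
    # Numerically stable log(1 + exp(z)); output is always positive.
    return np.logaddexp(0.0, z)

def coefficient_network(x, params):
    """Hypothetical one-hidden-layer network: covariates -> positive basis weights."""
    W1, b1, W2, b2 = params
    hidden = np.tanh(x @ W1 + b1)
    return softplus(hidden @ W2 + b2)

def quantile_curves(x, tau, knots, params, intercept=0.0):
    """Evaluate the monotone expansion at each covariate row and quantile level."""
    theta = coefficient_network(x, params)   # shape (n, J), all entries > 0
    basis = ramp_basis(tau, knots)           # shape (T, J), nondecreasing in tau
    return intercept + theta @ basis.T       # shape (n, T)

# Illustrative setup with random (unfitted) weights; no posterior sampling is done here.
rng = np.random.default_rng(0)
n_covariates, n_hidden, n_basis = 3, 8, 5
knots = np.linspace(0.0, 1.0, n_basis + 1)
params = (
    rng.normal(size=(n_covariates, n_hidden)),
    rng.normal(size=n_hidden),
    rng.normal(size=(n_hidden, n_basis)),
    rng.normal(size=n_basis),
)
x = rng.normal(size=(4, n_covariates))
tau = np.linspace(0.05, 0.95, 19)
q = quantile_curves(x, tau, knots, params)
```

Each row of `q` is nondecreasing across the columns, so curves for different quantile levels cannot cross at any covariate value. In a Bayesian treatment the network weights would carry a prior and be sampled or approximated rather than drawn once at random as above.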