Fan Jianqing, Jiang Bai, Sun Qiang
Department of Operations Research and Financial Engineering, Princeton University, Princeton, NJ 08544.
Department of Statistical Sciences, University of Toronto, Toronto, ON M5S 3G3.
J Econom. 2022 Sep;230(1):3-19. doi: 10.1016/j.jeconom.2020.06.012. Epub 2021 Nov 1.
Many sparse regression methods are based on the assumption that covariates are weakly correlated, which unfortunately does not hold in many economic and financial datasets. To address this challenge, we model the strongly correlated covariates by a factor structure: strong correlations among covariates are explained by common factors, and the remaining variations are interpreted as idiosyncratic components. We then propose a factor-adjusted sparse regression model with both common factors and idiosyncratic components as decorrelated covariates and develop a semi-Bayesian method. Rate-optimality of parameter estimation and model selection consistency are established by non-asymptotic analyses. We show on simulated data that the semi-Bayesian method outperforms its Lasso analogue, is insensitive to overestimates of the number of common factors, pays a negligible price when covariates are not correlated, scales up well with increasing sample size, dimensionality and sparsity, and converges fast to the equilibrium of the posterior distribution. Numerical results on a real dataset of U.S. bond risk premia and macroeconomic indicators also lend strong support to the proposed method.
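The factor adjustment described in the abstract can be illustrated with a minimal sketch. It assumes the common factors are estimated by principal component analysis, which the abstract does not specify, and it covers only the decorrelation step, not the semi-Bayesian estimator. The names `factor_adjust` and `mean_abs_offdiag_corr`, the simulated data, and the dimensions are hypothetical choices made for illustration.

```python
import numpy as np

def factor_adjust(X, n_factors):
    """Split X (n x p) into estimated common factors and idiosyncratic parts.

    Assumption: factors are estimated by PCA; this is an illustrative
    choice, not necessarily the estimator used in the paper.
    """
    n, _ = X.shape
    Xc = X - X.mean(axis=0)                      # center each covariate
    left, _, _ = np.linalg.svd(Xc, full_matrices=False)
    F_hat = np.sqrt(n) * left[:, :n_factors]     # n x K, with F_hat.T @ F_hat / n = I
    B_hat = Xc.T @ F_hat / n                     # p x K estimated loadings
    U_hat = Xc - F_hat @ B_hat.T                 # n x p idiosyncratic components
    return F_hat, B_hat, U_hat

def mean_abs_offdiag_corr(M):
    """Average absolute pairwise correlation between the columns of M."""
    C = np.corrcoef(M, rowvar=False)
    p = C.shape[0]
    return (np.abs(C).sum() - p) / (p * (p - 1))

# Simulated strongly correlated covariates: X = F B^T + U (hypothetical sizes).
rng = np.random.default_rng(0)
n, p, K = 200, 50, 3
F = rng.standard_normal((n, K))
B = rng.standard_normal((p, K))
U = rng.standard_normal((n, p))
X = F @ B.T + U

F_hat, B_hat, U_hat = factor_adjust(X, K)
corr_before = mean_abs_offdiag_corr(X)
corr_after = mean_abs_offdiag_corr(U_hat)

# Decorrelated design: estimated factors alongside idiosyncratic components.
Z = np.hstack([F_hat, U_hat])
print(f"mean |corr| among raw covariates:        {corr_before:.3f}")
print(f"mean |corr| among idiosyncratic components: {corr_after:.3f}")
```

A sparse regression would then be fitted on `Z` rather than on `X`, so that the penalized or prior-regularized coefficients attach to weakly correlated columns.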