Smoothing splines approximation using Hilbert curve basis selection.

Authors

Meng Cheng, Yu Jun, Chen Yongkai, Zhong Wenxuan, Ma Ping

Affiliations

Institute of Statistics and Big Data, Renmin University of China.

School of Mathematics and Statistics, Beijing Institute of Technology.

Publication information

J Comput Graph Stat. 2022;31(3):802-812. doi: 10.1080/10618600.2021.2002161. Epub 2022 Jan 12.

DOI: 10.1080/10618600.2021.2002161

PMID: 36407675

Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC9674117/

Abstract

Smoothing splines have been used pervasively in nonparametric regression. However, their computational burden is significant when the sample size n is large. When the number of predictors d ≥ 2, the computational cost of smoothing splines is of the order O(n^3) using the standard approach. Many methods have been developed to approximate smoothing spline estimators by using q basis functions instead of n, resulting in a computational cost of the order O(nq^2). These are called basis selection methods. Despite their algorithmic benefits, most basis selection methods require the assumption that the sample is uniformly distributed on a hypercube, and their performance may deteriorate when that assumption is not met. To overcome this obstacle, we develop an efficient algorithm that adapts to the unknown probability density function of the predictors. Theoretically, we show that the proposed estimator has the same convergence rate as the full-basis estimator when q is roughly of the order O[ ], where p ∈ [1, 2] and r ≈ 4 are constants that depend on the type of spline. Numerical studies on various synthetic datasets demonstrate the superior performance of the proposed estimator in comparison with mainstream competitors.
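The idea of basis selection can be illustrated with a small sketch. It is not the authors' algorithm: it assumes d = 2, replaces the spline reproducing kernel with a Gaussian kernel, and picks basis points at evenly spaced ranks along a Hilbert curve ordering of the data. The helper names (hilbert_index, select_basis, fit_selected_basis) and the values of lam and bandwidth are hypothetical choices made only for illustration. The fit solves a q-by-q system built from an n-by-q kernel matrix, which is where the O(nq^2) cost comes from.

```python
import numpy as np

def hilbert_index(x, y, order):
    """Position of integer grid cell (x, y) along a 2-D Hilbert curve
    covering a 2**order by 2**order grid (standard bitwise construction)."""
    side = 1 << order
    d = 0
    s = side >> 1
    while s > 0:
        rx = 1 if x & s else 0
        ry = 1 if y & s else 0
        d += s * s * ((3 * rx) ^ ry)
        if ry == 0:  # rotate/flip so the sub-curve is oriented correctly
            if rx == 1:
                x, y = side - 1 - x, side - 1 - y
            x, y = y, x
        s >>= 1
    return d

def select_basis(X, q, order=8):
    """Illustrative rule (an assumption, not the paper's procedure):
    sort points by Hilbert index and keep q evenly spaced ranks."""
    lo = X.min(axis=0)
    span = X.max(axis=0) - lo
    span[span == 0] = 1.0
    side = 1 << order
    cells = np.minimum(((X - lo) / span * side).astype(int), side - 1)
    h = np.array([hilbert_index(int(cx), int(cy), order) for cx, cy in cells])
    ranked = np.argsort(h, kind="stable")
    picks = np.linspace(0, len(X) - 1, q).round().astype(int)
    return ranked[picks]

def gaussian_kernel(A, B, bandwidth):
    """Gaussian kernel matrix between the rows of A and the rows of B."""
    sq = ((A[:, None, :] - B[None, :, :]) ** 2).sum(axis=2)
    return np.exp(-sq / (2.0 * bandwidth ** 2))

def fit_selected_basis(X, y, idx, lam, bandwidth):
    """Penalized least squares restricted to the q selected basis functions."""
    centers = X[idx]
    R = gaussian_kernel(X, centers, bandwidth)        # n x q
    Q = gaussian_kernel(centers, centers, bandwidth)  # q x q
    A = R.T @ R + len(X) * lam * Q                    # forming R.T @ R costs O(n q^2)
    coef, *_ = np.linalg.lstsq(A, R.T @ y, rcond=None)
    return centers, coef

# Synthetic, non-uniform design on the unit square (hypothetical example data).
rng = np.random.default_rng(0)
n, q = 500, 40
X = rng.beta(2.0, 5.0, size=(n, 2))
y = (np.sin(2 * np.pi * X[:, 0]) + np.cos(2 * np.pi * X[:, 1])
     + 0.1 * rng.standard_normal(n))

idx = select_basis(X, q)
centers, coef = fit_selected_basis(X, y, idx, lam=1e-4, bandwidth=0.2)
fitted = gaussian_kernel(X, centers, 0.2) @ coef
print("residual MSE:", float(np.mean((y - fitted) ** 2)))
```

Least squares is used instead of a direct solve because Gaussian kernel matrices are often badly conditioned. Selecting evenly spaced ranks makes the basis points follow the data density; other selection rules along the same ordering are possible.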

Similar articles

More efficient approximation of smoothing splines via space-filling basis selection.

Biometrika. 2020 Sep;107(3):723-735. doi: 10.1093/biomet/asaa019. Epub 2020 May 7.

Sparse and Efficient Estimation for Partial Spline Models with Increasing Dimension.

Ann Inst Stat Math. 2015 Feb 1;67(1):93-127. doi: 10.1007/s10463-013-0440-y.

An asymptotic and empirical smoothing parameters selection method for smoothing spline ANOVA models in large samples.

Biometrika. 2021 Mar;108(1):149-166. doi: 10.1093/biomet/asaa047. Epub 2020 Aug 27.

Comparing Smoothing Techniques for Fitting the Nonlinear Effect of Covariate in Cox Models.

Acta Inform Med. 2016 Feb;24(1):38-41. doi: 10.5455/aim.2016.24.38-41. Epub 2016 Feb 2.

Automatic search intervals for the smoothing parameter in penalized splines.

Stat Comput. 2023;33(1):1. doi: 10.1007/s11222-022-10178-z. Epub 2022 Nov 18.

Automatic Model Selection for Partially Linear Models.

J Multivar Anal. 2009 Oct 1;100(9):2100-2111. doi: 10.1016/j.jmva.2009.06.009.

A note on a nonparametric regression test through penalized splines.

Stat Sin. 2014;24:1143-1160. doi: 10.5705/ss.2012.230.

An efficient penalized estimation approach for semiparametric linear transformation models with interval-censored data.

Stat Med. 2022 May 10;41(10):1829-1845. doi: 10.1002/sim.9331. Epub 2022 Jan 31.

Asymptotic behavior of an intrinsic rank-based estimator of the Pickands dependence function constructed from B-splines.

Extremes (Boston). 2023;26(1):101-138. doi: 10.1007/s10687-022-00451-9. Epub 2022 Nov 9.

Cited by

AppleQSM: Geometry-Based 3D Characterization of Apple Tree Architecture in Orchards.

Plant Phenomics. 2024 May 1;6:0179. doi: 10.34133/plantphenomics.0179. eCollection 2024.
