Forward variable selection enables fast and accurate dynamic system identification with Karhunen-Loève decomposed Gaussian processes.

Affiliations

National Energy Technology Laboratory, Morgantown, WV, United States of America.

Department of Mechanical and Aerospace Engineering, West Virginia University, Morgantown, WV, United States of America.

Publication Information

PLoS One. 2024 Sep 20;19(9):e0309661. doi: 10.1371/journal.pone.0309661. eCollection 2024.

Abstract

A promising approach for scalable Gaussian processes (GPs) is the Karhunen-Loève (KL) decomposition, in which the GP kernel is represented by a set of basis functions which are the eigenfunctions of the kernel operator. Such decomposed kernels have the potential to be very fast, and do not depend on the selection of a reduced set of inducing points. However, KL decompositions lead to high dimensionality, and variable selection thus becomes paramount. This paper reports a new method of forward variable selection, enabled by the ordered nature of the basis functions in the KL expansion of the Bayesian Smoothing Spline ANOVA kernel (BSS-ANOVA), coupled with fast Gibbs sampling in a fully Bayesian approach. It quickly and effectively limits the number of terms, yielding a method with competitive accuracy, training times, and inference times for tabular datasets of low feature-set dimensionality. Theoretical computational complexities are [Formula: see text] in training and [Formula: see text] per point in inference, where N is the number of instances and P the number of expansion terms. The inference speed and accuracy make the method especially useful for dynamic system identification: the dynamics are modeled in the tangent space as a static problem, and the learned dynamics are then integrated using a high-order scheme. The methods are demonstrated on two dynamic datasets: a 'Susceptible, Infected, Recovered' (SIR) toy problem, along with the experimental 'Cascaded Tanks' benchmark dataset. Comparisons on the static prediction of time derivatives are made with a random forest (RF), a residual neural network (ResNet), and the Orthogonal Additive Kernel (OAK) inducing-points scalable GP, while for the time-series prediction comparisons are made with LSTM and GRU recurrent neural networks (RNNs) along with the SINDy package.
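To make the workflow concrete, the sketch below illustrates the "learn the derivative as a static problem, then integrate" idea the abstract describes, applied to the SIR toy system. It is not the authors' BSS-ANOVA/KL implementation: a standard scikit-learn Gaussian process stands in for the decomposed-kernel model, the SIR parameters and kernel settings are illustrative choices, and RK45 is used as one example of a high-order integration scheme. (For context, a Mercer/KL expansion writes the kernel as k(x, x') = Σ_i λ_i φ_i(x) φ_i(x'), with the eigenfunctions φ_i playing the role of the basis functions mentioned above.)

```python
# Minimal sketch of the learn-the-derivative-then-integrate workflow.
# NOT the paper's BSS-ANOVA/KL method: a generic GP regressor stands in
# for the decomposed-kernel model, and all parameter values are illustrative.
import numpy as np
from scipy.integrate import solve_ivp
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF

# SIR dynamics (beta, gamma chosen for illustration, not taken from the paper).
beta, gamma = 0.3, 0.1
def sir_rhs(t, y):
    S, I, R = y
    return [-beta * S * I, beta * S * I - gamma * I, gamma * I]

# Generate a training trajectory and its time derivatives (the "tangent space" data).
t_train = np.linspace(0.0, 60.0, 200)
sol = solve_ivp(sir_rhs, (0.0, 60.0), [0.99, 0.01, 0.0], t_eval=t_train)
X = sol.y.T                                   # states, shape (N, 3)
Y = np.array([sir_rhs(0.0, x) for x in X])    # derivatives at each state

# Static regression from state to derivative, one GP per state dimension.
models = [
    GaussianProcessRegressor(kernel=RBF(length_scale=0.5), alpha=1e-6).fit(X, Y[:, d])
    for d in range(X.shape[1])
]

# Learned right-hand side: query the static models, then hand the result to a
# high-order integrator (RK45 here) to roll the dynamics forward in time.
def learned_rhs(t, y):
    return [m.predict(y.reshape(1, -1))[0] for m in models]

pred = solve_ivp(learned_rhs, (0.0, 60.0), [0.99, 0.01, 0.0], t_eval=t_train)
print("max abs state error:", np.max(np.abs(pred.y - sol.y)))
```

In the paper's setting, the static regression step would instead use the KL-decomposed BSS-ANOVA GP with forward variable selection, which is what makes per-point inference fast enough for integration loops like the one above.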

Figure 1: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/0f8b/11414993/4c5311cfa583/pone.0309661.g001.jpg
