Suppr超能文献

一种用于变化点检测的计算效率高的非参数方法。

A computationally efficient nonparametric approach for changepoint detection.

作者信息

Haynes Kaylea, Fearnhead Paul, Eckley Idris A

机构信息

1STOR-i Centre for Doctoral Training, Lancaster University, Lancaster, UK.

2Department of Mathematics and Statistics, Lancaster University, Lancaster, UK.

出版信息

Stat Comput. 2017;27(5):1293-1305. doi: 10.1007/s11222-016-9687-5. Epub 2016 Jul 28.

Abstract

In this paper we build on an approach proposed by Zou et al. (2014) for nonparametric changepoint detection. This approach defines the best segmentation for a data set as the one which minimises a penalised cost function, with the cost function defined in term of minus a non-parametric log-likelihood for data within each segment. Minimising this cost function is possible using dynamic programming, but their algorithm had a computational cost that is cubic in the length of the data set. To speed up computation, Zou et al. (2014) resorted to a screening procedure which means that the estimated segmentation is no longer guaranteed to be the global minimum of the cost function. We show that the screening procedure adversely affects the accuracy of the changepoint detection method, and show how a faster dynamic programming algorithm, pruned exact linear time (PELT) (Killick et al. 2012), can be used to find the optimal segmentation with a computational cost that can be close to linear in the amount of data. PELT requires a penalty to avoid under/over-fitting the model which can have a detrimental effect on the quality of the detected changepoints. To overcome this issue we use a relatively new method, changepoints over a range of penalties (Haynes et al. 2016), which finds all of the optimal segmentations for multiple penalty values over a continuous range. We apply our method to detect changes in heart-rate during physical activity.

摘要

在本文中,我们基于邹等人(2014年)提出的非参数变点检测方法展开研究。该方法将数据集的最佳分割定义为能使惩罚成本函数最小化的分割,其中成本函数是根据每个分段内数据的负非参数对数似然来定义的。使用动态规划可以使这个成本函数最小化,但他们的算法的计算成本与数据集长度的立方成正比。为了加快计算速度,邹等人(2014年)采用了一种筛选程序,这意味着估计的分割不再保证是成本函数的全局最小值。我们表明,这种筛选程序会对变点检测方法的准确性产生不利影响,并展示了如何使用一种更快的动态规划算法——剪枝精确线性时间(PELT)算法(基利克等人,2012年),以接近数据量线性的计算成本来找到最优分割。PELT需要一个惩罚项来避免模型欠拟合或过拟合,这可能会对检测到的变点质量产生不利影响。为了克服这个问题,我们使用了一种相对较新的方法——一系列惩罚下的变点(海恩斯等人,2016年),该方法能找到连续范围内多个惩罚值的所有最优分割。我们将我们的方法应用于检测身体活动期间心率的变化。

文献AI研究员

20分钟写一篇综述,助力文献阅读效率提升50倍。

立即体验

用中文搜PubMed

大模型驱动的PubMed中文搜索引擎

马上搜索

文档翻译

学术文献翻译模型,支持多种主流文档格式。

立即体验