Domain agnostic online semantic segmentation for multi-dimensional time series.

Author information

Gharghabi Shaghayegh, Yeh Chin-Chia Michael, Ding Yifei, Ding Wei, Hibbing Paul, LaMunion Samuel, Kaplan Andrew, Crouter Scott E, Keogh Eamonn

Affiliations

1. Department of Computer Science and Engineering, University of California, Riverside, USA.

2. Department of Computer Science, University of Massachusetts Boston, Boston, USA.

Publication information

Data Min Knowl Discov. 2019;33(1):96-130. doi: 10.1007/s10618-018-0589-3. Epub 2018 Sep 25.

Abstract

Unsupervised semantic segmentation in the time series domain is a much-studied problem due to its potential to detect unexpected regularities and regimes in poorly understood data. However, current techniques have several shortcomings that have limited the adoption of time series semantic segmentation beyond academic settings, for four primary reasons. First, most methods require setting or learning many parameters and thus may have problems generalizing to novel situations. Second, most methods implicitly assume that all the data is segmentable and have difficulty when that assumption is unwarranted. Third, many algorithms are defined only for the single-dimensional case, despite the ubiquity of multi-dimensional data. Finally, most research efforts have been confined to the batch case, but online segmentation is clearly more useful and actionable. To address these issues, we present a multi-dimensional algorithm that is domain agnostic, has only one easily determined parameter, and can handle data streaming at a high rate. We test the algorithm on the largest and most diverse collection of time series datasets ever considered for this task and demonstrate the algorithm's superiority over current solutions.

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/eabc/6373324/9a3e76d7cc22/10618_2018_589_Fig1_HTML.jpg
