IEEE Trans Pattern Anal Mach Intell. 2023 Apr;45(4):4981-4996. doi: 10.1109/TPAMI.2022.3198411. Epub 2023 Mar 7.
Multivariate time series clustering has become an important research topic in time series learning. It aims to discover the correlation among multiple sequences and to partition multivariate time series data into several subsets. Although some existing methods can handle this task, most of them fail to discover informative subsequences from multivariate time series instances. In this paper, we first propose a novel unsupervised shapelet learning with adaptive neighbors (USLA) model for learning salient multivariate subsequences (i.e., multivariate shapelets), in which the importance of each variate is determined automatically for a given candidate multivariate shapelet. USLA performs multivariate shapelet-transformed representation learning and local structure learning simultaneously. However, USLA with multivariate shapelets of different lengths performs comparably to USLA with equal-length (isometric) multivariate shapelets. In fact, the shapelet-transformed representations learned from multivariate shapelets of different lengths can each represent multivariate time series instances on their own, and they often contain complementary information. Therefore, we develop a novel multiview USLA (MUSLA) model, which treats the shapelet-transformed representations learned from shapelets of different lengths as different views. In this way, once candidate multivariate shapelets of different lengths are determined, MUSLA learns the importance of each view and the neighbor graph matrix among the multiview representations. Experimental results show that MUSLA outperforms other state-of-the-art multivariate time series algorithms on real-world multivariate time series datasets.
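To make the notion of a shapelet-transformed representation concrete, the following is a minimal sketch rather than the model proposed in the paper. It assumes a commonly used shapelet distance (the minimum mean squared difference over all sliding windows) and takes the per-variate weights as fixed inputs, whereas USLA learns them. The names `shapelet_distance`, `shapelet_transform`, and `multiview_transform` are hypothetical and chosen only for illustration.

```python
from typing import Dict, List, Sequence

# A multivariate series or shapelet: one inner sequence per variate.
Series = Sequence[Sequence[float]]


def shapelet_distance(series: Series, shapelet: Series,
                      weights: Sequence[float]) -> float:
    """Minimum weighted mean squared distance between a multivariate
    shapelet and every same-length window of a multivariate series.

    Assumption: the weights are given; in USLA they would be learned.
    """
    n_variates = len(series)
    if len(shapelet) != n_variates or len(weights) != n_variates:
        raise ValueError("series, shapelet and weights need the same variate count")
    length = len(shapelet[0])
    n_steps = len(series[0])
    if length > n_steps:
        raise ValueError("shapelet is longer than the series")

    best = float("inf")
    for start in range(n_steps - length + 1):
        total = 0.0
        for v in range(n_variates):
            squared = sum((series[v][start + t] - shapelet[v][t]) ** 2
                          for t in range(length))
            total += weights[v] * squared / length
        best = min(best, total)
    return best


def shapelet_transform(dataset: Sequence[Series],
                       shapelets: Sequence[Series],
                       weights: Sequence[Sequence[float]]) -> List[List[float]]:
    """Represent each instance by its distances to all candidate shapelets."""
    return [[shapelet_distance(x, s, w) for s, w in zip(shapelets, weights)]
            for x in dataset]


def multiview_transform(dataset: Sequence[Series],
                        shapelets_by_length: Dict[int, Sequence[Series]],
                        weights_by_length: Dict[int, Sequence[Sequence[float]]]
                        ) -> Dict[int, List[List[float]]]:
    """One view per shapelet length, keyed by that length."""
    return {length: shapelet_transform(dataset, shapelets,
                                       weights_by_length[length])
            for length, shapelets in shapelets_by_length.items()}
```

In this sketch each shapelet length yields its own representation matrix, which corresponds to what the abstract calls a view. Learning the view importances and the neighbor graph, which is the actual contribution of MUSLA, is not covered here.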