IEEE Trans Pattern Anal Mach Intell. 2018 May;40(5):1128-1138. doi: 10.1109/TPAMI.2017.2710047. Epub 2017 Jun 8.
Machine learning algorithms for the analysis of time-series often depend on the assumption that utilised data are temporally aligned. Any temporal discrepancies arising in the data are certain to lead to ill-generalisable models, which in turn fail to correctly capture properties of the task at hand. The temporal alignment of time-series is thus a crucial challenge manifesting in a multitude of applications. Nevertheless, the vast majority of algorithms oriented towards temporal alignment are either applied directly on the observation space or simply utilise linear projections, thus failing to capture complex, hierarchical non-linear representations that may prove beneficial, especially when dealing with multi-modal data (e.g., visual and acoustic information). To this end, we present Deep Canonical Time Warping (DCTW), a method that automatically learns non-linear representations of multiple time-series that are (i) maximally correlated in a shared subspace, and (ii) temporally aligned. Furthermore, we extend DCTW to a supervised setting, where during training, available labels can be utilised towards enhancing the alignment process. By means of experiments on four datasets, we show that the representations learnt significantly outperform state-of-the-art methods in temporal alignment, elegantly handling scenarios with heterogeneous feature sets, such as the temporal alignment of acoustic and visual information.
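To make the notion of temporal alignment concrete, the following is a minimal sketch of classic dynamic time warping (DTW) applied directly to raw one-dimensional observations. It is not DCTW: it illustrates the observation-space baseline that the abstract contrasts against, whereas DCTW additionally learns non-linear, maximally correlated representations before aligning. The function name `dtw_path` and the squared-difference cost are assumptions chosen for illustration and do not come from the paper.

```python
def dtw_path(x, y):
    """Align two 1-D sequences with classic DTW.

    Returns (total_cost, path), where path is a list of (i, j) index pairs
    matching x[i] to y[j]. Illustrative sketch only; not the DCTW method.
    """
    n, m = len(x), len(y)
    inf = float("inf")
    # cost[i][j]: minimal cumulative cost of aligning x[:i] with y[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = (x[i - 1] - y[j - 1]) ** 2  # assumed local cost: squared difference
            cost[i][j] = d + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])

    # Backtrack from the end of both sequences to recover the warping path.
    i, j = n, m
    path = []
    while i > 0 and j > 0:
        path.append((i - 1, j - 1))
        candidates = [
            (cost[i - 1][j - 1], i - 1, j - 1),  # advance both sequences
            (cost[i - 1][j], i - 1, j),          # advance x only
            (cost[i][j - 1], i, j - 1),          # advance y only
        ]
        _, i, j = min(candidates)
    path.reverse()
    return cost[n][m], path


# Example: y is a time-stretched copy of x, so a zero-cost alignment exists.
total, path = dtw_path([0, 1, 2, 3], [0, 0, 1, 2, 2, 3])
```

In this example the second sequence repeats some samples of the first, so the recovered path matches one sample of `x` to several samples of `y`. A method such as DCTW would, roughly speaking, apply the alignment step to learnt representations rather than to the raw values used here.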