Lehman LH, Saeed M, Moody GB, Mark RG
Harvard-MIT Division of Health Sciences and Technology, Cambridge, MA, USA.
Comput Cardiol. 2008;35(4749126):653-656. doi: 10.1109/CIC.2008.4749126.
We present a similarity-based searching and pattern matching algorithm that identifies time series with similar temporal dynamics in large-scale, multi-parameter databases. We represent time series segments by feature vectors that reflect the dynamical patterns of single- and multi-dimensional physiological time series. Features include regression slopes at varying time scales, maximum transient changes, auto-correlation coefficients of individual signals, and cross-correlations among multiple signals. We model the dynamical patterns with a Gaussian mixture model (GMM) learned with the Expectation-Maximization (EM) algorithm, and measure similarity between segments by Mahalanobis distance. We evaluate the algorithm in three applications: search-by-example data retrieval, event classification, and forecasting, using synthetic and real physiologic time series from a variety of sources.
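The search-by-example idea can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes a single 1-D signal per segment, so cross-correlation features are omitted, and it swaps in one Gaussian covariance estimated over the whole database in place of the GMM learned with EM. The function names (slope, autocorr, features, rank_by_similarity), the time scales (8 and 32 samples), and the definition of "maximum transient change" as the largest one-step difference are all assumptions made for illustration.

```python
import numpy as np

def slope(x):
    """Least-squares regression slope of x against its sample index."""
    t = np.arange(len(x))
    return np.polyfit(t, x, 1)[0]

def autocorr(x, lag=1):
    """Sample autocorrelation coefficient at the given lag (0 if x is constant)."""
    x = x - x.mean()
    denom = np.dot(x, x)
    return np.dot(x[:-lag], x[lag:]) / denom if denom > 0 else 0.0

def features(segment, scales=(8, 32)):
    """Illustrative feature vector: slopes over the last `s` samples for each
    scale, largest one-step change, and lag-1 autocorrelation."""
    seg = np.asarray(segment, dtype=float)
    slopes = [slope(seg[-s:]) for s in scales]
    max_change = np.max(np.abs(np.diff(seg)))
    return np.array(slopes + [max_change, autocorr(seg)])

def rank_by_similarity(query, database):
    """Rank database segments by Mahalanobis distance to the query.

    Assumption: one covariance matrix for all segments (a single Gaussian),
    not the per-component covariances a fitted GMM would provide."""
    feats = np.array([features(s) for s in database])
    # Small ridge term keeps the covariance invertible.
    cov = np.cov(feats, rowvar=False) + 1e-6 * np.eye(feats.shape[1])
    inv = np.linalg.inv(cov)
    d = feats - features(query)
    sq = np.einsum("ij,jk,ik->i", d, inv, d)
    dist = np.sqrt(np.maximum(sq, 0.0))
    return np.argsort(dist), dist

# Synthetic database: rising ramps, falling ramps, and flat noise.
rng = np.random.default_rng(0)
t = np.arange(64)
database = (
    [0.5 * t + rng.normal(0, 0.2, 64) for _ in range(5)]
    + [-0.5 * t + rng.normal(0, 0.2, 64) for _ in range(5)]
    + [rng.normal(0, 0.2, 64) for _ in range(5)]
)
order, dist = rank_by_similarity(database[3], database)
print(order[:5], dist[order[:5]])
```

The first entries of order are the segments whose feature vectors lie closest to the query. With a fitted mixture model, the distance would instead be computed using the covariance of the relevant mixture component, which is the role the GMM plays in the abstract.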