IEEE J Biomed Health Inform. 2016 May;20(3):953-962. doi: 10.1109/JBHI.2015.2424711. Epub 2015 Apr 20.
In the last decade, data mining techniques have been applied to sensor data in a wide range of application domains, such as healthcare monitoring systems, manufacturing processes, intrusion detection, and database management. Many data mining techniques are based on computing the similarity between two sensor data patterns. A variety of representations and similarity measures for multiattribute time series have been proposed in the literature. In this paper, we describe a novel method for computing the similarity of two multiattribute time series based on a temporal version of Smith-Waterman (SW), a well-known bioinformatics algorithm. We then apply our method to sensor data from an eldercare application for early illness detection. Our method mitigates difficulties related to data uncertainty and aggregation that often arise when processing sensor data. The experiments took place at an aging-in-place facility, TigerPlace, located in Columbia, MO, USA. To validate our method, we used data from nonwearable sensor networks placed in TigerPlace apartments, combined with information from an electronic health record. We provide a set of experiments that investigate the properties of the temporal version of SW, together with experiments on TigerPlace datasets. On a pilot sensor dataset from nine residents, comprising a total of 1902 days and around 2.1 million sensor hits, we obtained an average F-measure of 0.75 for abnormal event prediction.
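For orientation, the following is a minimal sketch of the classic Smith-Waterman local alignment score, applied to sequences of discretized multiattribute observations. It is not the temporal version described in the paper, whose details are not given in this abstract. The scoring parameters (`match`, `mismatch`, `gap`), the normalization in `normalized_similarity`, and the example day labels are all assumptions made for illustration.

```python
from typing import Hashable, Sequence


def smith_waterman_score(
    a: Sequence[Hashable],
    b: Sequence[Hashable],
    match: float = 2.0,      # assumed reward for equal symbols
    mismatch: float = -1.0,  # assumed penalty for unequal symbols
    gap: float = -1.0,       # assumed linear gap penalty
) -> float:
    """Return the best local alignment score between a and b (classic SW)."""
    prev = [0.0] * (len(b) + 1)  # previous row of the dynamic programming table
    best = 0.0
    for x in a:
        curr = [0.0]  # first column is always zero
        for j, y in enumerate(b, start=1):
            diag = prev[j - 1] + (match if x == y else mismatch)
            up = prev[j] + gap        # gap in b
            left = curr[j - 1] + gap  # gap in a
            cell = max(0.0, diag, up, left)  # local alignment: never below zero
            curr.append(cell)
            best = max(best, cell)
        prev = curr
    return best


def normalized_similarity(a: Sequence[Hashable], b: Sequence[Hashable]) -> float:
    """Scale the score to [0, 1] by the best score the shorter sequence could reach.

    This normalization is an illustrative assumption, not taken from the paper.
    """
    match = 2.0
    denom = match * min(len(a), len(b))
    return smith_waterman_score(a, b, match=match) / denom if denom else 0.0


# Hypothetical example: each element is one day, discretized into
# (motion level, bed restlessness level) labels.
week_1 = [("low", "low"), ("mid", "low"), ("mid", "high"), ("high", "high")]
week_2 = [("mid", "low"), ("mid", "high"), ("high", "high"), ("low", "low")]
similarity = normalized_similarity(week_1, week_2)
```

Because each element only needs to be hashable and comparable with `==`, a tuple of attribute labels can stand in for one multiattribute observation. A temporal variant would additionally need to account for when each observation occurred, which this sketch does not attempt.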