Wiktorski Tomasz, Królak Aleksandra
Department of Electrical Engineering and Computer Science, University of Stavanger, Norway.
Institute of Electronics, Lodz University of Technology, Poland.
MethodsX. 2020 Oct 9;7:101094. doi: 10.1016/j.mex.2020.101094. eCollection 2020.
Time series are a common data type in biomedical applications. Examples include heart rate, power output, and ECG. One typical analysis is to determine the longest period a subject spent above a given heart rate threshold. While it might seem simple to find and measure such periods, biomedical data are often subject to significant noise and physiological artifacts. As a result, simple threshold calculations might not provide correct or expected results. A common way to improve such calculations is to use a moving average filter. The window length is often determined using the sum of absolute differences for various window sizes. However, for real-life biomedical data, such an approach might lead to extremely long windows that undesirably remove physiological information from the data. In this paper, we:
• propose a new approach to finding the window length using the zero-points of the third gradient (jerk) of the Sum of Absolute Differences method;
• demonstrate how these points can be used to determine periods and area over a given threshold, with and without uncertainty.
We demonstrate the validity of this approach on the PAMAP2 Physical Activity Monitoring Data Set, an open dataset from the UCI Machine Learning Repository, as well as on the PhysioNet Simultaneous Physiological Measurements dataset. The results show that the first zero-point usually falls at a window length of around 8 s and 5 s, respectively, while the second zero-point usually falls between 16 and 24 s and between 8 and 16 s, respectively. The value for the first zero-point can remove simple measurement errors when data are recorded once every few seconds. The value for the second zero-point corresponds well with what is known about the physiological response of the heart to changing load.
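The pipeline described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. It assumes that the sum of absolute differences (SAD) is taken between the raw signal and its moving average for each candidate window, that the third gradient is approximated by applying `numpy.gradient` three times, and that a zero-point is the first window at which the jerk changes sign. All function names (`moving_average`, `sad_curve`, `jerk_zero_points`, `longest_run_above`) are hypothetical.

```python
import numpy as np


def moving_average(x, window):
    """Moving average of length `window`; edges are padded with the edge
    values so the output has the same length as the input."""
    x = np.asarray(x, dtype=float)
    left = window // 2
    right = window - 1 - left
    padded = np.pad(x, (left, right), mode="edge")
    kernel = np.ones(window) / window
    return np.convolve(padded, kernel, mode="valid")


def sad_curve(x, windows):
    """SAD between the raw signal and its moving average, one value per
    window size (assumed definition, see the lead-in)."""
    x = np.asarray(x, dtype=float)
    return np.array([np.abs(x - moving_average(x, w)).sum() for w in windows])


def jerk_zero_points(windows, sad):
    """Window sizes at which the third gradient (jerk) of the SAD curve
    changes sign. Returns the window just after each sign change."""
    windows = np.asarray(windows, dtype=float)
    jerk = np.asarray(sad, dtype=float)
    for _ in range(3):  # apply the numerical gradient three times
        jerk = np.gradient(jerk, windows)
    sign = np.sign(jerk)
    crossings = np.where(sign[:-1] * sign[1:] < 0)[0]
    return [windows[i + 1] for i in crossings]


def longest_run_above(x, threshold):
    """Length, in samples, of the longest consecutive run above `threshold`."""
    best = run = 0
    for flag in np.asarray(x) > threshold:
        run = run + 1 if flag else 0
        best = max(best, run)
    return best
```

In use, one would compute `sad_curve(heart_rate, windows)` over a range of window sizes, pick a window from `jerk_zero_points`, smooth the signal with `moving_average`, and then call `longest_run_above` on the smoothed signal. Multiplying the result by the sampling interval converts it to seconds. The numerical gradient is less accurate near the ends of the window range, so sign changes reported there should be treated with caution.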