Wong Wei Ern, Wong An Hong, Peh Wei Qi, Tan Chee Keong
Monash University, Jalan Lagoon Selatan, Bandar Sunway, 47500 Subang Jaya, Selangor, Malaysia.
Data Brief. 2024 Jun 25;55:110673. doi: 10.1016/j.dib.2024.110673. eCollection 2024 Aug.
Human Activity Recognition (HAR) has emerged as a critical research area due to its extensive applications in various real-world domains. Numerous datasets based on Channel State Information (CSI) have been established to support the development and evaluation of advanced HAR algorithms. However, existing CSI-based HAR datasets often lack complexity and diversity in the activities they represent, which hinders the design of robust HAR models. These limitations typically manifest as a narrow focus on a limited range of activities or the exclusion of factors that influence real-world CSI measurements. Consequently, the scarcity of diverse training data can impede the development of efficient HAR systems. To address the limitations of existing datasets, this paper introduces a novel dataset that captures spatial diversity through multiple transceiver orientations over a high-dimensional space encompassing a large number of subcarriers. The dataset incorporates a wider range of real-world factors, including an extensive activity range, a spectrum of human movements (encompassing both micro- and macro-movements), variations in body composition, and diverse environmental conditions (noise and interference). The experiment is performed in a controlled laboratory environment with dimensions of 5 m (width) × 8 m (length) × 3 m (height) to capture CSI measurements for various human activities. Four ESP32-S3-DevKitC-1 devices, configured as transceiver pairs with unique Media Access Control (MAC) addresses, collect CSI data according to the Wi-Fi IEEE 802.11n standard. The devices are mounted on tripods at a height of 1.5 m. The transmitters (powered by external power banks) are positioned at the north and east and send multiple Wi-Fi beacons to their respective receivers (connected to laptops via USB for data collection) located at the south and west.
To capture multi-perspective CSI data, all six participants sequentially performed designated activities while standing in the centre of the tripod arrangement for 5 s per sample. The system collected approximately 300-450 packets per sample for approximately 1200 samples per activity, capturing CSI information across the 166 subcarriers employed in the Wi-Fi IEEE 802.11n standard. By leveraging the richness of this dataset, HAR researchers can develop more robust and generalizable CSI-based HAR models. Compared to traditional HAR approaches, these CSI-based models hold the promise of significantly enhanced accuracy and robustness when deployed in real-world scenarios. This stems from their ability to capture the nuanced dynamics of human movement by analysing wireless channel characteristics across different spatial variations (using the two-diagonal ESP32 transceiver configuration) with a higher degree of dimensionality (166 subcarriers).
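As a rough illustration of the figures above: 300-450 packets over a 5 s sample corresponds to roughly 60-90 packets per second, so samples differ in length. The sketch below shows one possible way to bring them to a common length by linear interpolation along the time axis. It assumes each sample has already been loaded as a list of per-packet rows holding 166 amplitude values; the dataset's actual file layout is not described here, and the names `packet_rate`, `resample_sample`, and `TARGET_PACKETS` are hypothetical. It is not the authors' preprocessing.

```python
from typing import List, Sequence

N_SUBCARRIERS = 166    # subcarriers per packet, as stated in the abstract
SAMPLE_SECONDS = 5.0   # duration of one sample, as stated in the abstract
TARGET_PACKETS = 300   # hypothetical fixed length chosen for illustration


def packet_rate(n_packets: int, duration_s: float = SAMPLE_SECONDS) -> float:
    """Average number of packets per second within one sample."""
    return n_packets / duration_s


def resample_sample(
    sample: Sequence[Sequence[float]],
    target_packets: int = TARGET_PACKETS,
) -> List[List[float]]:
    """Linearly interpolate a (packets x subcarriers) sample along time.

    Each row of `sample` is assumed to hold one packet's per-subcarrier
    values; the result has exactly `target_packets` rows.
    """
    n_packets = len(sample)
    if n_packets < 2 or target_packets < 2:
        raise ValueError("need at least two packets to interpolate")

    # Map output row i onto a fractional position in the input sequence.
    scale = (n_packets - 1) / (target_packets - 1)
    resampled = []
    for i in range(target_packets):
        pos = i * scale
        lo = min(int(pos), n_packets - 2)  # keep lo + 1 inside the input
        frac = pos - lo
        row_lo, row_hi = sample[lo], sample[lo + 1]
        resampled.append([a + frac * (b - a) for a, b in zip(row_lo, row_hi)])
    return resampled


if __name__ == "__main__":
    # Synthetic stand-in for one sample: 450 packets x 166 subcarriers.
    fake_sample = [[float(t)] * N_SUBCARRIERS for t in range(450)]
    fixed = resample_sample(fake_sample)
    print(len(fixed), len(fixed[0]))          # 300 166
    print(packet_rate(300), packet_rate(450))  # 60.0 90.0
```

Interpolation keeps the whole 5 s window, whereas truncating to the shortest sample would discard the tail of longer recordings; which is preferable depends on the model and is left open here.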