Varga Domonkos
Nokia Bell Labs, 1082 Budapest, Hungary.
Sensors (Basel). 2024 Dec 22;24(24):8201. doi: 10.3390/s24248201.
Human action recognition using WiFi channel state information (CSI) has gained attention due to its non-intrusive nature and potential applications in healthcare, smart environments, and security. However, the reliability of methods developed for CSI-based action recognition is often contingent on the quality of the datasets and evaluation protocols used. In this paper, we uncovered a critical data leakage issue, which arises from improper data partitioning, in a widely used WiFi CSI benchmark dataset. Specifically, the benchmark fails to separate individuals between the training and test sets, leading to inflated performance metrics as models inadvertently learn individual-specific features rather than generalizable action patterns. We analyzed this issue in depth, retrained several benchmarked models using corrected data partitioning methods, and demonstrated a significant drop in accuracy when individuals were properly separated across training and testing. Our findings highlight the importance of rigorous data partitioning in CSI-based action recognition and provide recommendations for mitigating data leakage in future research. This work contributes to the development of more robust and reliable human action recognition systems using WiFi CSI.
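The partitioning issue described above can be illustrated with a minimal sketch. It assumes each CSI recording carries a subject identifier. The names `subject_ids`, `subject_disjoint_split`, and `shared_subjects`, and the toy data, are hypothetical and are not taken from the benchmark or from the paper's code. The sketch contrasts a sample-level split, in which the same individuals appear on both sides, with a subject-level split, in which they do not.

```python
import random


def subject_disjoint_split(subject_ids, test_fraction=0.25, seed=0):
    """Split sample indices so that no subject appears in both sets.

    subject_ids: one subject label per recording (hypothetical metadata).
    Returns (train_idx, test_idx) as lists of sample indices.
    """
    subjects = sorted(set(subject_ids))
    rng = random.Random(seed)
    rng.shuffle(subjects)
    # Hold out whole subjects, at least one.
    n_test = max(1, round(len(subjects) * test_fraction))
    test_subjects = set(subjects[:n_test])
    train_idx = [i for i, s in enumerate(subject_ids) if s not in test_subjects]
    test_idx = [i for i, s in enumerate(subject_ids) if s in test_subjects]
    return train_idx, test_idx


def shared_subjects(subject_ids, train_idx, test_idx):
    """Return the subjects present in both partitions (empty set = no leakage)."""
    train_subjects = {subject_ids[i] for i in train_idx}
    test_subjects = {subject_ids[i] for i in test_idx}
    return train_subjects & test_subjects


# Toy data (assumed): 4 subjects with 3 recordings each.
subject_ids = [s for s in ["p1", "p2", "p3", "p4"] for _ in range(3)]

# Sample-level split: every 4th recording goes to the test set,
# so recordings of the same person land on both sides.
naive_test = list(range(0, len(subject_ids), 4))
naive_train = [i for i in range(len(subject_ids)) if i not in naive_test]
naive_leak = shared_subjects(subject_ids, naive_train, naive_test)

# Subject-level split: whole individuals are held out.
train_idx, test_idx = subject_disjoint_split(subject_ids)
clean_leak = shared_subjects(subject_ids, train_idx, test_idx)

print("sample-level split shares subjects:", sorted(naive_leak))
print("subject-level split shares subjects:", sorted(clean_leak))
```

Checking for shared subjects before training is a cheap safeguard. Libraries such as scikit-learn offer group-aware splitters that serve the same purpose, provided the subject labels are available.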