AI-Generated Fall Data: Assessing LLMs and Diffusion Model for Wearable Fall Detection.

Authors

Alamgeer Sana, Souissi Yasine, Ngu Anne

Affiliations

Department of Computer Science, Texas State University, San Marcos, TX 78666, USA.

College of Computing and Informatics, University of North Carolina, Charlotte, NC 28223, USA.

Publication information

Sensors (Basel). 2025 Aug 19;25(16):5144. doi: 10.3390/s25165144.

Abstract

Training fall detection systems is challenging due to the scarcity of real-world fall data, particularly from elderly individuals. To address this, we explore the potential of Large Language Models (LLMs) for generating synthetic fall data. This study evaluates text-to-motion (T2M, SATO, and ParCo) and text-to-text models (GPT4o, GPT4, and Gemini) in simulating realistic fall scenarios. We generate synthetic datasets and integrate them with four real-world baseline datasets to assess their impact on fall detection performance using a Long Short-Term Memory (LSTM) model. Additionally, we compare LLM-generated synthetic data with a diffusion-based method to evaluate their alignment with real accelerometer distributions. Results indicate that dataset characteristics significantly influence the effectiveness of synthetic data, with LLM-generated data performing best in low-frequency settings (e.g., 20 Hz) while showing instability in high-frequency datasets (e.g., 200 Hz). While text-to-motion models produce more realistic biomechanical data than text-to-text models, their impact on fall detection varies. Diffusion-based synthetic data demonstrates the closest alignment to real data but does not consistently enhance model performance. An ablation study further confirms that the effectiveness of synthetic data depends on sensor placement and fall representation. These findings provide insights into optimizing synthetic data generation for fall detection models.
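The integration step described in the abstract (bringing synthetic fall signals to a dataset's sampling rate and merging them with real training data) can be illustrated with a minimal sketch. This is not the authors' pipeline: the function names `resample_linear` and `augment_training_set`, the `ratio` parameter, the window layout, and the use of plain linear interpolation (with no anti-aliasing filter) are all assumptions made for illustration.

```python
import random


def resample_linear(signal, src_hz, dst_hz):
    """Resample a 1-D list of samples from src_hz to dst_hz.

    Assumption: simple linear interpolation, no anti-aliasing filter.
    """
    if len(signal) < 2:
        return list(signal)
    duration = (len(signal) - 1) / src_hz          # seconds covered by the signal
    n_out = int(round(duration * dst_hz)) + 1      # number of output samples
    out = []
    for i in range(n_out):
        pos = i * src_hz / dst_hz                  # position in source-sample units
        lo = min(int(pos), len(signal) - 1)
        hi = min(lo + 1, len(signal) - 1)
        frac = pos - lo
        out.append(signal[lo] * (1 - frac) + signal[hi] * frac)
    return out


def augment_training_set(real, synthetic_falls, ratio, seed=0):
    """Merge real (window, label) pairs with sampled synthetic fall windows.

    Hypothetical helper: `ratio` is the number of synthetic windows to add,
    expressed as a fraction of the number of real windows. Synthetic windows
    are all labelled 1 (fall).
    """
    rng = random.Random(seed)
    k = min(len(synthetic_falls), int(round(ratio * len(real))))
    extra = rng.sample(synthetic_falls, k)
    combined = list(real) + [(window, 1) for window in extra]
    rng.shuffle(combined)
    return combined


# Usage: bring a 200 Hz synthetic trace down to 20 Hz, then mix it in.
trace_200hz = [float(i) for i in range(201)]       # 1 second at 200 Hz
trace_20hz = resample_linear(trace_200hz, 200, 20)

real_windows = [([0.0] * 4, 0) for _ in range(10)]  # placeholder non-fall windows
synthetic_windows = [[1.0] * 4 for _ in range(8)]   # placeholder synthetic falls
train = augment_training_set(real_windows, synthetic_windows, ratio=0.5)
```

In practice the merged set would then be fed to the classifier (an LSTM in the study). A production resampler would normally low-pass filter before downsampling, which this sketch omits.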

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/b792/12390476/9c864bc71ecc/sensors-25-05144-g001.jpg
