Martinez-Cruz Carmen, Rueda Antonio J, Popescu Mihail, Keller James M
Department of Computer Science, University of Jaén, Spain.
Health Management and Informatics, University of Missouri, USA.
IEEE Trans Fuzzy Syst. 2022 Apr;30(4):1048-1059. doi: 10.1109/tfuzz.2021.3052107. Epub 2021 Jan 18.
Time series analysis has been an active area of research for years, with important applications in forecasting and in discovering hidden information, such as patterns or anomalies, in observed data. In recent years, the use of time series analysis techniques to generate natural language descriptions and summaries of variables such as temperature, heart rate, or CO2 emissions has received increasing attention. In many cases, natural language has been recognized as more effective than traditional graphical representations of numerical data, particularly when a large amount of data needs to be inspected or when the user lacks the background and skills needed to interpret it. In this work, we describe a novel mechanism for generating linguistic descriptions of time series using natural language and fuzzy logic techniques. The proposed method generates high-quality summaries that capture the time series features relevant to a user in a particular application, and it can be easily customized for different domains. The approach has been successfully applied to generate linguistic descriptions of bed restlessness data from residents at TigerPlace (Columbia, Missouri), which serves as a case study to illustrate the modeling process and show the quality of the descriptions obtained.