Pioli Laercio, de Macedo Douglas D J, Costa Daniel G, Dantas Mario A R
INE, Computer Science Department, Federal University of Santa Catarina, Florianopolis 88040-370, Brazil.
Department of Information Science, Federal University of Santa Catarina, Florianopolis 88040-370, Brazil.
Sensors (Basel). 2024 Jan 7;24(2):358. doi: 10.3390/s24020358.
The accelerated development of technologies within the Internet of Things landscape has led to exponential growth in the volume of heterogeneous data generated by interconnected sensors, particularly in scenarios with multiple data sources, such as smart cities. Transferring, processing, and storing such vast amounts of sensed data poses significant challenges for Internet of Things systems. In this context, data reduction techniques based on artificial intelligence have emerged as promising solutions to these challenges, alleviating the burden on storage, bandwidth, and computational resources. This article proposes a framework that exploits data reduction to decrease the amount of heterogeneous data in certain applications. It also proposes a machine learning model that predicts the distortion rate and the corresponding reduction rate of the imputed data and uses the predicted values to select the most suitable approach among many reduction techniques. To support this decision, the model also considers the context of the data producer, which dictates the class of reduction algorithm that may be applied to the input stream. The results indicate that the Huffman algorithm performed better for the reduction of time-series data, with significant potential applications in smart city scenarios.
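The selection step described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not specify the model, its features, or any thresholds. The `Candidate` class, the `select_technique` function, the `allow_lossy` flag (standing in for the data producer's context), the `max_distortion` bound, the technique names other than Huffman, and all numeric values are hypothetical and chosen only for illustration. The sketch assumes that some predictor has already estimated a reduction rate and a distortion rate for each technique, and that the context decides whether lossy techniques are permitted.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    """A reduction technique with hypothetical predicted rates."""
    name: str
    lossless: bool
    predicted_reduction: float   # fraction of data removed, between 0 and 1
    predicted_distortion: float  # 0.0 means exact reconstruction


def select_technique(candidates, allow_lossy, max_distortion=0.05):
    """Pick the allowed candidate with the highest predicted reduction.

    allow_lossy models the producer context (assumption): when False,
    only lossless techniques may be applied to the input stream.
    """
    allowed = [
        c for c in candidates
        if c.lossless
        or (allow_lossy and c.predicted_distortion <= max_distortion)
    ]
    if not allowed:
        raise ValueError("no reduction technique is permitted in this context")
    # Prefer higher reduction; break ties by lower distortion.
    return max(
        allowed,
        key=lambda c: (c.predicted_reduction, -c.predicted_distortion),
    )


# Illustrative values only; they are not results from the article.
candidates = [
    Candidate("huffman", lossless=True,
              predicted_reduction=0.45, predicted_distortion=0.0),
    Candidate("downsampling", lossless=False,
              predicted_reduction=0.70, predicted_distortion=0.12),
    Candidate("quantization", lossless=False,
              predicted_reduction=0.60, predicted_distortion=0.03),
]

strict = select_technique(candidates, allow_lossy=False)
relaxed = select_technique(candidates, allow_lossy=True)
print(strict.name, relaxed.name)
```

With these made-up numbers, a context that forbids lossy reduction leaves only the lossless candidate, while a context that tolerates bounded distortion selects the lossy candidate with the largest predicted reduction that stays under the bound.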