Khattach Ouiam, Moussaoui Omar, Hassine Mohammed
Department of Informatics, MATSI Laboratory EST, University Mohammed First, Oujda 60000, Morocco.
Tisalabs Limited, T12 Y275 Cork, Ireland.
Sensors (Basel). 2025 May 7;25(9):2945. doi: 10.3390/s25092945.
The rapid proliferation of Internet of Things (IoT) devices across industries has created a need for robust, scalable, and real-time data processing architectures capable of supporting intelligent analytics and predictive maintenance. This paper presents a novel comprehensive architecture that enables end-to-end processing of IoT data streams, from acquisition to actionable insights. The system integrates Kafka-based message brokering for the high-throughput ingestion of real-time sensor data, with Apache Spark facilitating batch and stream extraction, transformation, and loading (ETL) processes. A modular machine-learning pipeline handles automated data preprocessing, training, and evaluation across various models. The architecture incorporates continuous monitoring and optimization components to track system performance and model accuracy, feeding insights to users via a dedicated Application Programming Interface (API). The design ensures scalability, flexibility, and real-time responsiveness, making it well suited for industrial IoT applications requiring continuous monitoring and intelligent decision-making.
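The flow summarized above (ingestion, ETL, model training and evaluation, monitoring) can be illustrated with a toy, in-process sketch. It is not the authors' implementation: a `deque` stands in for a Kafka topic, plain functions stand in for the Spark ETL stage, and a mean-plus-k-standard-deviations threshold stands in for the machine-learning models. All names, fields, and sample values (`ingest`, `etl`, `fit_threshold`, `run_pipeline`, `temp_c`) are hypothetical.

```python
from collections import deque
from statistics import mean, pstdev

# Hypothetical historical readings used to "train" the stand-in model.
TRAINING_VALUES = [20.0, 20.5, 19.5, 20.0]

# Hypothetical live messages; in the described system these would arrive via Kafka.
LIVE_MESSAGES = [
    {"sensor_id": "s1", "temp_c": "20.2"},
    {"sensor_id": "s1", "temp_c": "bad"},   # malformed payload
    {"sensor_id": "s2", "temp_c": "35.0"},  # abnormal reading
]


def ingest(messages):
    """Stand-in for message brokering: buffer messages in arrival order."""
    return deque(messages)


def etl(buffer):
    """Stand-in for ETL: parse each message and skip malformed records."""
    clean = []
    while buffer:
        msg = buffer.popleft()
        try:
            clean.append({"sensor_id": msg["sensor_id"],
                          "temp_c": float(msg["temp_c"])})
        except (KeyError, TypeError, ValueError):
            continue  # malformed record: not loaded
    return clean


def fit_threshold(values, k=2.0):
    """Stand-in for training: learn a mean + k * std anomaly threshold."""
    return mean(values) + k * pstdev(values)


def run_pipeline(messages, training_values):
    """Run ingest -> ETL -> scoring and report simple monitoring counters."""
    buffer = ingest(messages)
    received = len(buffer)
    records = etl(buffer)
    threshold = fit_threshold(training_values)
    alerts = [r for r in records if r["temp_c"] > threshold]
    return {
        "received": received,
        "dropped": received - len(records),  # monitoring: data-quality counter
        "alerts": alerts,
    }


result = run_pipeline(LIVE_MESSAGES, TRAINING_VALUES)
print(result)
```

In a real deployment each stage would be a separate, independently scalable service; the sketch only shows how data and monitoring counters pass from one stage to the next.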