AlMohimeed Abdulaziz
College of Computer and Information Sciences, Imam Mohammad Ibn Saud Islamic University (IMSIU), Riyadh, Saudi Arabia.
PeerJ Comput Sci. 2025 Jun 4;11:e2899. doi: 10.7717/peerj-cs.2899. eCollection 2025.
In recent years, Internet of Things (IoT)-based technologies have advanced healthcare by enabling monitoring systems, which in turn generate rapidly growing volumes of streaming data. This streaming data can be preprocessed and analyzed using technologies that integrate ensemble models, explainable artificial intelligence (XAI), feature selection (FS) methods, and big data stream processing platforms to build predictive real-time systems. This integration helps healthcare organizations enhance clinical decision-making, improve patient care, and raise the overall quality of healthcare. This article presents a real-time system for the early detection and treatment of chronic kidney disease (CKD) using a real-world simulation application. The system is developed in two phases. The first phase proposes a stacking model, applies a genetic algorithm (GA) and particle swarm optimization (PSO) for feature selection, and examines the stacking model trained on the best features using XAI. The best model with the best-optimized features is then used in the second phase. The results showed that the stacking model with GA achieved the highest performance, with 100% accuracy, 100% precision, 100% recall, and a 100% F1-score. The second phase is built on Confluent Cloud, which offers several benefits for creating a real-time streaming system based on Apache Kafka and provides multiple APIs, including the Producer API and the Consumer API for data producers and consumers, respectively. Python scripts are developed to pipeline the streaming data. The first Python script generates streaming health attributes and pushes them into a Kafka topic. The second Python script consumes health attributes from the Kafka topic and applies the stacking model to predict CKD in real time. The results showed that the stacking model with features selected by GA recorded the best performance, with 100% accuracy.
The pipeline's streaming steps have validated our approach's effectiveness in real-time, leveraging Confluent Cloud and Apache Kafka.
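
The GA-based feature selection used in the first phase can be illustrated with a minimal sketch. This is not the article's implementation: the toy data, the nearest-centroid classifier (standing in for the stacking model), the per-feature penalty, and all function names and hyperparameters below are assumptions chosen only to keep the example self-contained. Each candidate solution is a bitmask over the features, and its fitness is the classifier's accuracy on the selected features minus a small cost for each feature kept.

```python
import random


def nearest_centroid_accuracy(rows, labels, mask):
    """Training-set accuracy of a nearest-centroid classifier on the masked features."""
    cols = [i for i, bit in enumerate(mask) if bit]
    if not cols:
        return 0.0  # an empty feature set cannot classify anything
    centroids = {}
    for cls in set(labels):
        members = [r for r, y in zip(rows, labels) if y == cls]
        centroids[cls] = [sum(r[c] for r in members) / len(members) for c in cols]
    correct = 0
    for r, y in zip(rows, labels):
        pred = min(
            centroids,
            key=lambda k: sum((r[c] - m) ** 2 for c, m in zip(cols, centroids[k])),
        )
        correct += pred == y
    return correct / len(rows)


def ga_select(rows, labels, generations=20, pop_size=12, mutation_rate=0.1, seed=0):
    """Return a 0/1 feature mask found by a simple genetic algorithm."""
    rng = random.Random(seed)
    n = len(rows[0])

    def fitness(mask):
        # Assumed trade-off: accuracy minus a small penalty per selected feature.
        return nearest_centroid_accuracy(rows, labels, mask) - 0.01 * sum(mask)

    population = [[rng.randint(0, 1) for _ in range(n)] for _ in range(pop_size)]
    for _ in range(generations):
        population.sort(key=fitness, reverse=True)
        parents = population[: pop_size // 2]  # truncation selection keeps the elite
        children = []
        while len(parents) + len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            cut = rng.randrange(1, n)  # one-point crossover
            child = a[:cut] + b[cut:]
            # Bit-flip mutation
            child = [bit ^ 1 if rng.random() < mutation_rate else bit for bit in child]
            children.append(child)
        population = parents + children
    return max(population, key=fitness)


def make_toy_data(n_rows=60, n_features=6, seed=1):
    """Synthetic data in which only features 0 and 1 carry class signal."""
    rng = random.Random(seed)
    rows, labels = [], []
    for i in range(n_rows):
        y = i % 2
        row = [rng.gauss(0, 1) for _ in range(n_features)]
        row[0] += 3 * y
        row[1] -= 3 * y
        rows.append(row)
        labels.append(y)
    return rows, labels


rows, labels = make_toy_data()
best_mask = ga_select(rows, labels)
print("selected feature indices:", [i for i, b in enumerate(best_mask) if b])
```

In a setting closer to the one described, the fitness function would instead score the stacking model on held-out data, which is considerably more expensive per evaluation than this stand-in.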
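
The two-script streaming flow of the second phase (one script producing health attributes to a topic, another consuming them and predicting CKD) can be sketched without a broker. In this sketch an in-memory `queue.Queue` stands in for the Kafka topic, the attribute names (`serum_creatinine`, `hemoglobin`) are hypothetical, and `placeholder_predict` is an arbitrary rule used only so the example runs; it is neither the article's stacking model nor a clinical criterion. With a real Confluent Cloud cluster, the `put` and `get` calls would presumably be replaced by a Kafka producer's send call and a consumer's poll loop.

```python
import json
import queue
import random
import threading

TOPIC = queue.Queue()  # in-memory stand-in for a Kafka topic (assumption)
SENTINEL = None        # end-of-stream marker, needed only for this finite demo


def produce_records(n_records, seed=0):
    """Generate synthetic health attributes and push them to the topic as JSON bytes."""
    rng = random.Random(seed)
    for i in range(n_records):
        record = {
            "patient_id": i,
            "serum_creatinine": round(rng.uniform(0.5, 6.0), 2),  # hypothetical attribute
            "hemoglobin": round(rng.uniform(7.0, 17.0), 1),       # hypothetical attribute
        }
        TOPIC.put(json.dumps(record).encode("utf-8"))
    TOPIC.put(SENTINEL)


def placeholder_predict(record):
    """Arbitrary stand-in for the trained model; returns 1 (flagged) or 0."""
    return int(record["serum_creatinine"] > 1.5 and record["hemoglobin"] < 12.0)


def consume_and_predict():
    """Read messages from the topic until the sentinel and score each record."""
    results = []
    while True:
        message = TOPIC.get()
        if message is SENTINEL:
            break
        record = json.loads(message.decode("utf-8"))
        results.append((record["patient_id"], placeholder_predict(record)))
    return results


producer = threading.Thread(target=produce_records, args=(5,))
producer.start()
predictions = consume_and_predict()
producer.join()
print(predictions)
```

Serializing each record as JSON bytes mirrors how messages are typically carried as byte payloads on a topic, so the consumer side decodes and parses before calling the model.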