Department of Computer Science and Software Engineering, College of Information Technology, UAE University, Al Ain P.O. Box 15551, United Arab Emirates.
College of Computing and Informatics, Sharjah University, Sharjah P.O. Box 27272, United Arab Emirates.
Sensors (Basel). 2023 Jul 16;23(14):6443. doi: 10.3390/s23146443.
Continuous monitoring of patients involves collecting and analyzing sensory data from a multitude of sources. To overcome communication overhead, ensure data privacy and security, reduce data loss, and maintain efficient resource usage, the processing and analytics are moved close to where the data are located (e.g., the edge). However, data quality (DQ) can be degraded by imprecise or malfunctioning sensors, dynamic changes in the environment, and transmission failures or delays. It is therefore crucial to monitor data quality and detect problems as quickly as possible, so that they do not mislead clinical judgment and lead to the wrong course of action. In this article, a novel approach called federated data quality profiling (FDQP) is proposed to assess the quality of the data at the edge. FDQP is inspired by federated learning (FL) and serves as a condensed document or a guide for node data quality assurance. The FDQP formal model is developed to capture the quality dimensions specified in the data quality profile (DQP). The proposed approach uses federated feature selection to improve classifier precision and ranks features based on criteria such as feature value, outlier percentage, and missing data percentage. The proposed FDQP model was evaluated through extensive experiments using a fetal dataset split across different edge nodes and a set of carefully chosen scenarios. The results demonstrated that the proposed data-quality-aware federated PSN architecture, leveraging the FDQP model with data collected from edge nodes, improved the DQ and, in turn, the accuracy of the federated patient similarity network (FPSN)-based machine learning models.
Our profiling algorithm exchanged lightweight profiles instead of performing full data processing at the edge, which yielded optimal data quality while improving efficiency. Overall, FDQP is an effective method for assessing data quality in edge computing environments, and we believe that the proposed approach can be applied to scenarios beyond patient monitoring.
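The abstract does not spell out the profiling algorithm, so the following is only a minimal sketch of the general idea of exchanging lightweight profiles and ranking features by missing-data and outlier percentages. Everything in it is an assumption made for illustration: the function names (`profile_node`, `aggregate`, `rank_features`), the feature names `hr` and `uc`, the IQR outlier rule, and the scoring formula are hypothetical and are not taken from the paper. The "feature value" criterion mentioned above is omitted.

```python
from statistics import quantiles


def profile_node(rows, features):
    """Build a local profile: per-feature counts only, no raw values leave the node."""
    profile = {}
    for f in features:
        values = [r.get(f) for r in rows]
        present = [v for v in values if v is not None]
        outliers = 0
        # Assumed outlier rule: 1.5 * IQR fences, applied only when at
        # least four values are available on this node.
        if len(present) >= 4:
            q1, _, q3 = quantiles(present, n=4)
            iqr = q3 - q1
            lo, hi = q1 - 1.5 * iqr, q3 + 1.5 * iqr
            outliers = sum(1 for v in present if v < lo or v > hi)
        profile[f] = {
            "n": len(values),
            "missing": len(values) - len(present),
            "outliers": outliers,
        }
    return profile


def aggregate(profiles):
    """Sum the per-node counts into one global profile (server side)."""
    total = {}
    for p in profiles:
        for f, stats in p.items():
            t = total.setdefault(f, {"n": 0, "missing": 0, "outliers": 0})
            for key in t:
                t[key] += stats[key]
    return total


def rank_features(total):
    """Rank features best-first by an assumed score: 1 - missing% - outlier%."""
    scores = {}
    for f, s in total.items():
        missing_pct = s["missing"] / s["n"] if s["n"] else 1.0
        outlier_pct = s["outliers"] / s["n"] if s["n"] else 0.0
        scores[f] = 1.0 - missing_pct - outlier_pct
    return sorted(scores, key=scores.get, reverse=True)


# Two hypothetical edge nodes holding made-up readings.
node_a = [
    {"hr": 140, "uc": 3}, {"hr": 142, "uc": None}, {"hr": 138, "uc": 4},
    {"hr": 141, "uc": None}, {"hr": 139, "uc": 5}, {"hr": 300, "uc": 4},
]
node_b = [
    {"hr": 137, "uc": 2}, {"hr": 143, "uc": 3},
    {"hr": 140, "uc": None}, {"hr": 139, "uc": 4},
]

features = ["hr", "uc"]
total = aggregate([profile_node(node_a, features), profile_node(node_b, features)])
ranking = rank_features(total)
print(total)
print(ranking)
```

In this sketch each node sends three integers per feature, which is the sense in which a profile exchange is lighter than moving or fully processing the raw data. Note that outliers are detected per node, so a value that is extreme only relative to the global distribution would not be flagged; a real design would need to decide how to handle that.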