Department of Digital Systems, University of Piraeus, M. Karaoli & A. Dimitriou 80, 18534, Piraeus, Greece.
Comput Methods Programs Biomed. 2019 Nov;181:104967. doi: 10.1016/j.cmpb.2019.06.026. Epub 2019 Jun 29.
Healthcare 4.0 is being hailed as the current industrial revolution in the healthcare domain, involving billions of heterogeneous IoT data sources that are connected over the Internet and aim to provide real-time health-related information to citizens and patients. An automated way of identifying the quality levels of these data sources is therefore essential for obtaining reliable health data.
In this manuscript, we demonstrate an innovative mechanism for assessing the quality of datasets in correlation with the quality of the corresponding data sources. The mechanism follows a five-step approach in which the available data sources are detected, identified, and connected to health platforms, where their data is finally gathered. Once the data is obtained, the mechanism cleans it and correlates it with the quality measurements captured from each data source, in order to decide whether each data source is of sufficient quality and, if so, to keep its data for further analysis.
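The final decision step described above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the `SourceReport` type, the 0-to-1 quality scores, the plausible-range cleaning rule, the equal weighting, and the 0.7 threshold are all hypothetical choices made only for illustration.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass
class SourceReport:
    """Hypothetical record for one data source (not from the paper)."""
    name: str
    source_quality: float  # assumed score in [0, 1] for the source itself
    readings: list         # raw values gathered from the source; None = missing


def clean(readings, low, high):
    """Drop missing readings and values outside an assumed plausible range."""
    return [r for r in readings if r is not None and low <= r <= high]


def data_quality(readings, low, high):
    """Fraction of raw readings that survive cleaning (0.0 for empty input)."""
    if not readings:
        return 0.0
    return len(clean(readings, low, high)) / len(readings)


def is_acceptable(report, low, high, threshold=0.7):
    """Combine source quality and data quality.

    Equal weighting and the default threshold are assumptions.
    """
    combined = mean([report.source_quality,
                     data_quality(report.readings, low, high)])
    return combined >= threshold


# Example: a body weight scale with one missing and one implausible reading.
scale = SourceReport("body weight scale", 0.9, [71.2, 71.4, None, 500.0, 71.3])
print(is_acceptable(scale, low=2.0, high=300.0))
```

In this sketch, a source's data is kept for further analysis only when `is_acceptable` returns `True`; a real system would presumably derive both scores from richer measurements than the ones assumed here.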
The proposed mechanism is evaluated through an experiment using a sample of 18 existing heterogeneous medical data sources. Based on the captured results, we were able to identify a data source of unknown type, recognizing it as a body weight scale. We then determined that the API method responsible for gathering data from this data source was getMeasurements(). By combining the quality of the body weight scale with the quality of its derived data, we concluded that this data source was of sufficient quality.
By capturing the quality of a data source through measuring and correlating both the quality of the data source itself and the quality of its derived data, the proposed mechanism provides efficient results and is able to ensure both data source quality and data quality end to end.