Improving Machine Learning Classification Accuracy for Breathing Abnormalities by Enhancing Dataset.

Affiliations

Department of Electrical Engineering, HITEC University, Taxila 47080, Pakistan.

Department of Electrical and Computer Engineering, COMSATS University Islamabad, Attock Campus, Attock 43600, Pakistan.

Publication information

Sensors (Basel). 2021 Oct 12;21(20):6750. doi: 10.3390/s21206750.

Abstract

Coronavirus disease 2019 (COVID-19), caused by severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2), has emerged as a global pandemic with a high mortality rate. The main complication of COVID-19 is rapid respiratory deterioration, which may lead to life-threatening pneumonia. Global healthcare systems currently face a scarcity of resources and cannot assist all critical patients simultaneously. Non-critical patients are therefore mostly advised to self-isolate or quarantine at home, where only limited healthcare services are available. According to research, nearly 20-30% of COVID-19 patients require hospitalization, while almost 5-12% may require intensive care due to severe health conditions. This pandemic requires global healthcare systems that are intelligent, secure, and reliable. Tremendous efforts have already been made to develop non-contact sensing technologies for the diagnosis of COVID-19. The most significant early indication of COVID-19 is rapid and abnormal breathing. In this research work, RF-based technology is used to collect real-time data on breathing abnormalities. Subsequently, based on these data, a large dataset of simulated breathing abnormalities is generated using the curve fitting technique for developing a machine learning (ML) classification model. The advantages of generating simulated breathing abnormalities data are twofold: it helps counter the daunting and time-consuming task of real-time data collection, and it improves the ML model accuracy. Several ML algorithms are used to classify eight breathing patterns: eupnea, bradypnea, tachypnea, Biot, sighing, Kussmaul, Cheyne-Stokes, and central sleep apnea (CSA). The performance of the ML algorithms is evaluated based on accuracy, prediction speed, and training time for both real-time and simulated breathing data. The results show that the proposed platform classifies breathing patterns with a maximum accuracy of 97.5% on real-time data alone, whereas introducing simulated breathing data increases the accuracy up to 99.3%. This work has a notable medical impact, as the introduced method mitigates the data collection challenge of building a realistic model from a large dataset during the pandemic.
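
The abstract does not say which function is fitted or how the simulated traces are drawn, so the following is only a minimal sketch of the general idea of curve-fitting-based augmentation, not the authors' method. It assumes a single-sinusoid breathing model, uses a synthetic stand-in for one measured trace (not the paper's RF data), and all names (`fit_sinusoid`, `simulate_traces`, `jitter`, `noise_sd`) are hypothetical.

```python
import math
import random


def solve_linear(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [bi] for row, bi in zip(A, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        if abs(M[pivot][col]) < 1e-12:
            return None  # singular system, caller skips this candidate
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for k in range(col, n + 1):
                M[r][k] -= factor * M[col][k]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        x[i] = (M[i][n] - sum(M[i][k] * x[k] for k in range(i + 1, n))) / M[i][i]
    return x


def fit_sinusoid(t, y, candidate_freqs):
    """Least-squares fit of y ~ a*sin(2*pi*f*t) + b*cos(2*pi*f*t) + c.

    For a fixed frequency f the model is linear in (a, b, c), so each
    candidate is solved via the normal equations and the best one is kept.
    """
    best = None
    for f in candidate_freqs:
        rows = [(math.sin(2 * math.pi * f * ti), math.cos(2 * math.pi * f * ti), 1.0)
                for ti in t]
        A = [[sum(r[i] * r[j] for r in rows) for j in range(3)] for i in range(3)]
        rhs = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(3)]
        p = solve_linear(A, rhs)
        if p is None:
            continue
        sse = sum((yi - sum(pk * rk for pk, rk in zip(p, r))) ** 2
                  for r, yi in zip(rows, y))
        if best is None or sse < best[0]:
            best = (sse, f, p)
    if best is None:
        raise ValueError("no candidate frequency could be fitted")
    sse, f, (a, b, c) = best
    # a*sin(x) + b*cos(x) == amplitude * sin(x + phase)
    return {"freq_hz": f, "amplitude": math.hypot(a, b),
            "phase": math.atan2(b, a), "offset": c, "sse": sse}


def simulate_traces(fit, t, n_traces, rng, jitter=0.05, noise_sd=0.02):
    """Draw simulated traces by perturbing the fitted parameters (assumed scheme)."""
    traces = []
    for _ in range(n_traces):
        f = fit["freq_hz"] * (1 + rng.uniform(-jitter, jitter))
        amp = fit["amplitude"] * (1 + rng.uniform(-jitter, jitter))
        phase = rng.uniform(0, 2 * math.pi)
        traces.append([amp * math.sin(2 * math.pi * f * ti + phase)
                       + fit["offset"] + rng.gauss(0, noise_sd) for ti in t])
    return traces


# Synthetic stand-in for one measured trace (illustrative values only):
# 15 breaths/min (0.25 Hz), sampled at 10 Hz for 30 s, with Gaussian noise.
rng = random.Random(0)
t = [i / 10 for i in range(300)]
measured = [math.sin(2 * math.pi * 0.25 * ti + 0.3) + rng.gauss(0, 0.05) for ti in t]

freqs = [0.10 + 0.01 * k for k in range(51)]  # 6 to 36 breaths/min
fit = fit_sinusoid(t, measured, freqs)
dataset = simulate_traces(fit, t, n_traces=100, rng=rng)
print(f"fitted rate: {fit['freq_hz'] * 60:.1f} breaths/min, traces: {len(dataset)}")
```

A single sinusoid is only plausible for regular patterns that differ mainly in rate. Patterns with pauses or waxing and waning depth, such as Cheyne-Stokes or Biot breathing, would likely need a richer fitted model, and the perturbation ranges here are arbitrary placeholders.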
