Science, Mathematics and Technology, Singapore University of Technology and Design, Singapore 487372, Singapore.
Department of Paediatric Anaesthesia, KK Women's and Children's Hospital, Singapore 229899, Singapore.
Sensors (Basel). 2021 Aug 18;21(16):5555. doi: 10.3390/s21165555.
Intelligent systems are transforming the world, including our healthcare system. We propose a deep learning-based cough sound classification model that can distinguish healthy coughs in children from pathological coughs associated with conditions such as asthma, upper respiratory tract infection (URTI), and lower respiratory tract infection (LRTI). To train a deep neural network model, we collected a new dataset of cough sounds, labelled with a clinician's diagnosis. The chosen model is a bidirectional long short-term memory network (BiLSTM) operating on Mel-frequency cepstral coefficient (MFCC) features. When trained to classify two classes of coughs (healthy versus pathological, where the pathological class is either pathology in general or one specific respiratory pathology), the model reaches an accuracy exceeding 84% against the label given by the physician's diagnosis. To classify a subject's respiratory pathology condition, the results of multiple cough epochs per subject were combined; the resulting prediction accuracy exceeds 91% for all three respiratory pathologies. However, when the model is trained to discriminate among four classes of coughs, overall accuracy drops: one class of pathological cough is often misclassified as another. If one counts only whether a healthy cough is classified as healthy and a pathological cough is classified as having some kind of pathology, the overall accuracy of the four-class model is above 84%. A longitudinal study of the MFCC feature space, comparing pathological and recovered coughs collected from the same subjects, revealed that pathological coughs occupy the same feature space irrespective of the underlying condition, which makes them harder to differentiate using MFCC features alone.
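The two evaluation steps described in the abstract, combining multiple cough epochs into one prediction per subject and scoring the four-class model on healthy versus any pathology, can be sketched as follows. This is a minimal illustration, not the authors' implementation: the abstract does not state how epoch results were combined, so the majority vote used here is an assumption, and the function names and the label string "healthy" are hypothetical.

```python
from collections import Counter, defaultdict

HEALTHY = "healthy"  # assumed label string; the paper's label encoding is not given


def subject_level_predictions(epoch_predictions):
    """Combine per-epoch predictions into one label per subject.

    epoch_predictions: iterable of (subject_id, predicted_label) pairs,
    one pair per cough epoch. Majority voting is an assumed combination
    rule; ties go to the label seen first for that subject.
    """
    votes = defaultdict(Counter)
    for subject_id, label in epoch_predictions:
        votes[subject_id][label] += 1
    return {sid: counts.most_common(1)[0][0] for sid, counts in votes.items()}


def collapsed_binary_accuracy(true_labels, predicted_labels):
    """Score multi-class predictions as healthy versus any pathology.

    A prediction counts as correct when it agrees with the true label on
    whether the cough is healthy, even if the specific pathology differs.
    """
    if len(true_labels) != len(predicted_labels):
        raise ValueError("label lists must have the same length")
    if not true_labels:
        raise ValueError("label lists must not be empty")
    correct = sum(
        (t == HEALTHY) == (p == HEALTHY)
        for t, p in zip(true_labels, predicted_labels)
    )
    return correct / len(true_labels)


if __name__ == "__main__":
    # Made-up example data, for illustration only.
    epochs = [
        ("s1", "asthma"), ("s1", "asthma"), ("s1", "healthy"),
        ("s2", "healthy"), ("s2", "healthy"), ("s2", "URTI"),
    ]
    print(subject_level_predictions(epochs))

    true = ["healthy", "asthma", "URTI", "LRTI"]
    pred = ["healthy", "URTI", "asthma", "healthy"]
    print(collapsed_binary_accuracy(true, pred))
```

In the made-up example, two of the four predictions name the wrong pathology, yet both count as correct under the collapsed scoring because they are flagged as pathological. This mirrors how the four-class model can confuse pathologies with one another while remaining accurate at separating healthy from pathological coughs.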