IEEE J Biomed Health Inform. 2021 Aug;25(8):2938-2947. doi: 10.1109/JBHI.2021.3064237. Epub 2021 Aug 5.
This paper presents and explores a robust deep learning framework for auscultation analysis, which aims to classify anomalies in respiratory cycles and detect diseases from respiratory sound recordings. The framework begins with front-end feature extraction that transforms the input sound into a spectrogram representation. A back-end deep learning network then classifies the spectrogram features into categories of respiratory anomaly cycles or diseases. Experiments conducted on the ICBHI benchmark dataset of respiratory sounds confirm three main contributions towards respiratory-sound analysis. Firstly, we carry out an extensive exploration of the effect of spectrogram type, spectral-time resolution, overlapping versus non-overlapping windows, and data augmentation on final prediction accuracy. Secondly, this leads us to propose a novel deep learning system, built on the proposed framework, that outperforms current state-of-the-art methods. Finally, we apply a Teacher-Student scheme to achieve a trade-off between model performance and model complexity, which holds promise for building real-time applications.
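The front-end stage described in the abstract can be illustrated with a minimal sketch: a log-magnitude STFT spectrogram computed over overlapping Hann windows. This is only one of the spectrogram types and time-frequency resolutions the paper explores; the window length, hop size, and toy signal below are illustrative assumptions, not the authors' exact configuration.

```python
import numpy as np

def spectrogram(signal, win_len=256, hop=128):
    """Log-magnitude spectrogram over overlapping Hann-windowed frames.

    hop < win_len gives overlapping windows; the win_len/hop ratio sets
    the spectral-time resolution trade-off the paper investigates.
    """
    window = np.hanning(win_len)
    n_frames = 1 + (len(signal) - win_len) // hop
    frames = np.stack([signal[i * hop : i * hop + win_len] * window
                       for i in range(n_frames)])
    mag = np.abs(np.fft.rfft(frames, axis=1))   # one-sided spectrum per frame
    return np.log(mag + 1e-8)                   # shape: (time frames, freq bins)

# Toy stand-in for a respiratory recording: 1 s amplitude-modulated tone at 4 kHz.
sr = 4000
t = np.arange(sr) / sr
x = np.sin(2 * np.pi * 200 * t) * (0.5 + 0.5 * np.sin(2 * np.pi * 0.25 * t))
S = spectrogram(x)
print(S.shape)  # (30, 129): 30 overlapping frames, win_len // 2 + 1 bins
```

A matrix like `S` is what the back-end network would consume; in the full framework each spectrogram corresponds to one respiratory cycle or recording to be classified.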