Annu Int Conf IEEE Eng Med Biol Soc. 2021 Nov;2021:80-83. doi: 10.1109/EMBC46164.2021.9630577.
Large annotated lung sound databases are publicly available and can be used to train algorithms for diagnostic systems. However, developing a well-performing algorithm for small non-public datasets, which contain only a few subjects and differ in recording devices and setup, can be challenging. In this paper, we use transfer learning to tackle the mismatch in recording setup, which allows us to transfer knowledge from one dataset to another for crackle detection in lung sounds. In particular, a single-input convolutional neural network (CNN) model is pre-trained on a source domain using ICBHI 2017, the largest publicly available database of lung sounds. We use log-mel spectrogram features of respiratory cycles of lung sounds. The pre-trained network is then used to build a multi-input CNN model, which shares the same network architecture for respiratory cycles and their corresponding respiratory phases. The multi-input model is fine-tuned on the target domain, our self-collected lung sound database, to classify crackles and normal lung sounds. Our experimental results show a significant performance improvement of 9.84% (absolute) in F-score on the target domain when using the multi-input CNN model and transfer learning for crackle detection.
Clinical relevance- Crackle detection in lung sounds, multi-input convolutional neural networks, transfer learning.
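The log-mel spectrogram features mentioned in the abstract can be illustrated with a small standard-library sketch. The frame length, hop size, number of mel bands, Hann window, and the HTK-style mel formula are assumptions chosen for illustration; the abstract does not state the authors' settings. The naive DFT is used only to keep the example self-contained, and an optimized audio library would normally replace it.

```python
import cmath
import math


def hz_to_mel(hz):
    # HTK-style mel scale (an assumption; other variants exist).
    return 2595.0 * math.log10(1.0 + hz / 700.0)


def mel_to_hz(mel):
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)


def mel_filterbank(n_mels, n_fft, sample_rate):
    """Triangular filters whose centres are evenly spaced on the mel scale."""
    n_bins = n_fft // 2 + 1
    top = hz_to_mel(sample_rate / 2.0)
    # n_mels + 2 edge frequencies in Hz, evenly spaced in mel.
    edges = [mel_to_hz(top * i / (n_mels + 1)) for i in range(n_mels + 2)]
    bin_hz = [k * sample_rate / n_fft for k in range(n_bins)]
    bank = []
    for m in range(1, n_mels + 1):
        lo, mid, hi = edges[m - 1], edges[m], edges[m + 1]
        row = []
        for f in bin_hz:
            rising = (f - lo) / (mid - lo)
            falling = (hi - f) / (hi - mid)
            row.append(max(0.0, min(rising, falling)))
        bank.append(row)
    return bank


def power_spectrum(frame):
    """Power of the one-sided DFT of a Hann-windowed frame (naive O(n^2) DFT)."""
    n = len(frame)
    windowed = [
        x * (0.5 - 0.5 * math.cos(2.0 * math.pi * i / (n - 1)))
        for i, x in enumerate(frame)
    ]
    spectrum = []
    for k in range(n // 2 + 1):
        acc = sum(
            x * cmath.exp(-2j * math.pi * k * i / n)
            for i, x in enumerate(windowed)
        )
        spectrum.append(abs(acc) ** 2)
    return spectrum


def log_mel_spectrogram(signal, sample_rate, n_fft=64, hop=32, n_mels=8, eps=1e-10):
    """Return a list of frames, each a list of n_mels log mel-band energies."""
    bank = mel_filterbank(n_mels, n_fft, sample_rate)
    frames = []
    for start in range(0, len(signal) - n_fft + 1, hop):
        power = power_spectrum(signal[start:start + n_fft])
        frames.append(
            [math.log(sum(w * p for w, p in zip(row, power)) + eps) for row in bank]
        )
    return frames


# Hypothetical test signal: a 400 Hz tone sampled at 4 kHz (not lung sound data).
tone = [math.sin(2.0 * math.pi * 400.0 * t / 4000.0) for t in range(256)]
features = log_mel_spectrogram(tone, sample_rate=4000)
```

Each respiratory cycle would be converted to such a time-by-mel matrix before being passed to the CNN; here `features` holds one row per frame and one column per mel band.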
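To make the reported metric concrete, the following minimal sketch computes the F-score from confusion counts and contrasts an absolute improvement (a difference in percentage points, as the abstract reports) with a relative one. All counts below are made-up illustration values, not results from the paper, and the function names are hypothetical.

```python
def f_score(tp, fp, fn, beta=1.0):
    """F-beta score from true positives, false positives, and false negatives."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall == 0.0:
        return 0.0
    b2 = beta * beta
    return (1.0 + b2) * precision * recall / (b2 * precision + recall)


def improvement(baseline, improved):
    """Return (absolute gain in percentage points, relative gain in percent)."""
    absolute = (improved - baseline) * 100.0
    relative = (improved - baseline) / baseline * 100.0
    return absolute, relative


# Hypothetical confusion counts for a baseline and a fine-tuned model.
baseline_f = f_score(tp=50, fp=25, fn=25)
finetuned_f = f_score(tp=60, fp=20, fn=20)
absolute_gain, relative_gain = improvement(baseline_f, finetuned_f)
print(f"baseline F={baseline_f:.4f}, fine-tuned F={finetuned_f:.4f}")
print(f"absolute gain={absolute_gain:.2f} points, relative gain={relative_gain:.2f}%")
```

With these illustrative counts the absolute gain is smaller than the relative gain, which is why stating "(absolute)" alongside a percentage removes ambiguity.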