Poh Yang Yi, Grooby Ethan, Tan Kenneth, Zhou Lindsay, King Arrabella, Ramanathan Ashwin, Malhotra Atul, Harandi Mehrtash, Marzbanrad Faezeh
Department of Electrical and Computer Systems Engineering, Monash University, Clayton, Melbourne, VIC 3800, Australia.
BC Children's Hospital Research Institute and the Department of Electrical and Computer Engineering, University of British Columbia, Vancouver, BC V6T 1Z4, Canada.
IEEE Open J Eng Med Biol. 2024 May 15;5:345-352. doi: 10.1109/OJEMB.2024.3401571. eCollection 2024.
Auscultation for neonates is a simple and non-invasive method of diagnosing cardiovascular and respiratory disease. However, obtaining high-quality chest sounds containing only heart or lung sounds is non-trivial. Hence, this study introduces a new deep-learning model named NeoSSNet and compares its performance in neonatal chest sound separation against previous methods. We propose a mask-based architecture similar to Conv-TasNet. The encoder and decoder consist of a 1D convolution and a 1D transposed convolution, respectively, while the mask generator consists of a convolution and transformer architecture. The input chest sounds are first encoded as a sequence of tokens using the 1D convolution. The tokens are then passed to the mask generator, which produces two masks, one for heart sounds and one for lung sounds. Each mask is applied to the input token sequence, and the masked tokens are converted back to waveforms using the 1D transposed convolution. Our proposed model showed superior results compared to the previous methods on objective distortion measures, with improvements ranging from 2.01 dB to 5.06 dB. The proposed model is also significantly faster than the previous methods, running at least 17 times faster. The proposed model could be a suitable preprocessing step for any health monitoring system where only the heart sound or lung sound is desired.
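The following is a minimal PyTorch sketch of the mask-based separation pipeline described in the abstract (1D-convolution encoder, convolution-plus-transformer mask generator, masked tokens decoded by a 1D transposed convolution). All layer sizes, kernel widths, stride, and transformer settings are illustrative assumptions, not the actual NeoSSNet hyperparameters.

```python
# Hedged sketch of a Conv-TasNet-style mask-based separator, as described in
# the abstract. Hyperparameters below are assumptions for illustration only.
import torch
import torch.nn as nn


class MaskSeparator(nn.Module):
    def __init__(self, n_sources=2, n_filters=256, kernel_size=16, stride=8,
                 n_heads=4, n_layers=2):
        super().__init__()
        # Encoder: 1D convolution turns the waveform into a token sequence.
        self.encoder = nn.Conv1d(1, n_filters, kernel_size, stride=stride)
        # Mask generator: convolution followed by a transformer encoder,
        # producing one mask per source (heart and lung).
        self.bottleneck = nn.Conv1d(n_filters, n_filters, 1)
        layer = nn.TransformerEncoderLayer(d_model=n_filters, nhead=n_heads,
                                           batch_first=True)
        self.transformer = nn.TransformerEncoder(layer, num_layers=n_layers)
        self.mask_head = nn.Conv1d(n_filters, n_sources * n_filters, 1)
        # Decoder: 1D transposed convolution maps masked tokens back to audio.
        self.decoder = nn.ConvTranspose1d(n_filters, 1, kernel_size, stride=stride)
        self.n_sources = n_sources
        self.n_filters = n_filters

    def forward(self, mixture):                     # mixture: (batch, 1, time)
        tokens = torch.relu(self.encoder(mixture))  # (batch, F, T')
        feats = self.bottleneck(tokens)
        feats = self.transformer(feats.transpose(1, 2)).transpose(1, 2)
        masks = torch.sigmoid(self.mask_head(feats))
        masks = masks.view(-1, self.n_sources, self.n_filters, masks.shape[-1])
        # Apply each mask to the shared token sequence, then decode each source.
        separated = [self.decoder(tokens * masks[:, i])
                     for i in range(self.n_sources)]
        return torch.stack(separated, dim=1)        # (batch, sources, 1, time)


# Example: separate a 4-second chest-sound mixture sampled at 4 kHz (assumed rate).
if __name__ == "__main__":
    model = MaskSeparator()
    mixture = torch.randn(1, 1, 4 * 4000)
    heart_and_lung = model(mixture)
    print(heart_and_lung.shape)  # torch.Size([1, 2, 1, 16000])
```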