Moon Hye Jeong, Ji Hyunmin, Kim Baek Seung, Kim Beom Joon, Kim Kyunghoon
Department of Pediatrics, Seoul National University College of Medicine, Seoul, Republic of Korea.
Department of Health Science and Technology, Seoul National University, Seoul, Republic of Korea.
Front Pediatr. 2025 May 20;13:1428862. doi: 10.3389/fped.2025.1428862. eCollection 2025.
Auscultation is a critical diagnostic tool for lung diseases, but it is subjective and difficult to quantify. Artificial intelligence models have been developed to overcome these limitations.
In this prospective study, we aimed to compare respiratory sound feature extraction methods to develop an optimal machine learning model for detecting wheezing in children. Pediatric pulmonologists recorded and verified 103 instances of wheezing and 184 other respiratory sounds in 76 children. Various methods were used for sound feature extraction, and dimensions were reduced using t-distributed Stochastic Neighbor Embedding (t-SNE). The performance of models in wheezing detection was evaluated using a kernel support vector machine (SVM).
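To illustrate what a feature extraction step of this kind can look like, the following is a minimal sketch of a simplified spectral contrast computation on a single audio frame. It is not the authors' implementation: the function names, the equal-width frequency bands, the 20% quantile, and the 64-sample test frame are all assumptions made for illustration (audio libraries typically use octave-spaced bands). The idea is that a tonal sound concentrates energy in narrow spectral peaks, so the gap between peaks and valleys within a band grows.

```python
import cmath
import math


def magnitude_spectrum(frame):
    """Naive DFT magnitudes for the non-negative frequency bins of one frame."""
    n = len(frame)
    mags = []
    for k in range(n // 2 + 1):
        acc = sum(frame[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        mags.append(abs(acc))
    return mags


def spectral_contrast(frame, n_bands=4, quantile=0.2, eps=1e-10):
    """Per-band difference (in dB) between spectral peaks and valleys.

    Simplification (assumption): bands are equal-width in bin index.
    """
    mags = magnitude_spectrum(frame)[1:]  # drop the DC bin
    band_size = len(mags) // n_bands
    contrasts = []
    for b in range(n_bands):
        band = sorted(mags[b * band_size:(b + 1) * band_size])
        k = max(1, int(round(quantile * len(band))))
        valley = sum(band[:k]) / k   # mean of the weakest bins in the band
        peak = sum(band[-k:]) / k    # mean of the strongest bins in the band
        contrasts.append(20 * math.log10((peak + eps) / (valley + eps)))
    return contrasts


# Hypothetical 64-sample frame: a pure tone landing exactly on DFT bin 5.
tone = [math.sin(2 * math.pi * 5 * t / 64) for t in range(64)]
contrast = spectral_contrast(tone)
print([round(c, 1) for c in contrast])
```

In this toy frame, the band containing the tone shows a much larger contrast than the bands containing only numerical noise. In a full pipeline, per-frame vectors like this would be aggregated per recording before dimensionality reduction and classification.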
The recording durations in the wheezing and non-wheezing groups were 89.36 ± 39.51 ms and 63.09 ± 27.79 ms, respectively. The Mel-spectrogram, Mel-frequency cepstral coefficients (MFCC), and spectral contrast represented respiratory sounds best and performed well in cluster classification. The SVM model using spectral contrast exhibited the best performance, with an accuracy, precision, recall, and F1 score of 0.897, 0.800, 0.952, and 0.869, respectively.
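The reported F1 score can be checked against the reported precision and recall, since F1 is their harmonic mean. The sketch below uses only the rounded figures from the abstract; the function name is hypothetical.

```python
def f1_from_precision_recall(precision: float, recall: float) -> float:
    """Return the F1 score, the harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Precision and recall as reported for the spectral contrast SVM model.
reported_precision = 0.800
reported_recall = 0.952

f1 = f1_from_precision_recall(reported_precision, reported_recall)
print(round(f1, 3))  # 0.869, consistent with the reported F1 score
```

Because recall (0.952) is higher than precision (0.800), the model in this evaluation missed few wheezing sounds but labeled some non-wheezing sounds as wheezing.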
Mel-spectrograms, MFCC, and spectral contrast are effective for characterizing respiratory sounds in children. A machine learning model using spectral contrast demonstrated high detection performance, indicating its potential utility in ensuring accurate diagnosis of pediatric respiratory diseases.