Lella Kranthi Kumar, Jagadeesh M S, Alphonse P J A
School of Computer Science and Engineering, VIT-AP University, Vijayawada, Guntur, Andhra Pradesh 522237 India.
Department of Computer Applications, NIT Tiruchirappalli, Tiruchirappalli, Guntur, Tamil Nadu 620015 India.
Health Inf Sci Syst. 2024 Mar 9;12(1):22. doi: 10.1007/s13755-024-00283-w. eCollection 2024 Dec.
The use of respiratory sound features to diagnose lung diseases has increased significantly in the past few years. Medical researchers and engineers have examined digital stethoscope data extensively to identify the symptoms of respiratory diseases. Artificial intelligence-based approaches are applied in real-world settings to distinguish signs of respiratory disease in human pulmonary auscultation sounds. In this work, a deep convolutional neural network (DCNN) is implemented with combined multi-feature channels (Modified MFCC, Log Mel, and Soft Mel) to extract sound parameters from digital stethoscope recordings of the lungs. The model is evaluated with and without max-pooling operations using these multi-feature channels. In addition, the work includes recently acquired COVID-19 sound data and enriched data to enhance model performance, combined with L2 regularization to reduce the risk of overfitting caused by the limited amount of respiratory sound data. The proposed DCNN with max-pooling achieves state-of-the-art performance on the improved dataset using multi-feature channel spectrograms. The model was developed with five different convolutional filter sizes, which were used to test the proposed network. According to the experimental findings, the proposed DCNN architecture with max-pooling identifies respiratory disease symptoms better than the DCNN without max-pooling. To demonstrate its effectiveness in classification, the model is trained and tested on several modalities of respiratory sound data.
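As a rough illustration of two ingredients named in the abstract, max-pooling and L2 regularization, the following pure-Python sketch shows what each operation computes. The function names, the 2x2 pooling window, and the penalty coefficient `lam` are assumptions chosen for illustration; the abstract does not state the authors' pooling size, regularization strength, or framework.

```python
from typing import List

Matrix = List[List[float]]


def max_pool_2d(feature_map: Matrix, size: int = 2) -> Matrix:
    """Non-overlapping max-pooling (stride equals window size).

    Trailing rows or columns that do not fill a whole window are dropped.
    The 2x2 default is an assumption, not a value taken from the paper.
    """
    rows = len(feature_map) // size
    cols = len(feature_map[0]) // size
    return [
        [
            max(
                feature_map[r * size + i][c * size + j]
                for i in range(size)
                for j in range(size)
            )
            for c in range(cols)
        ]
        for r in range(rows)
    ]


def l2_penalty(weights: List[float], lam: float = 0.01) -> float:
    """L2 regularization term lam * sum(w^2), added to the training loss.

    `lam` is a hypothetical coefficient used only for this sketch.
    """
    return lam * sum(w * w for w in weights)


if __name__ == "__main__":
    # Toy 4x4 "spectrogram patch" standing in for one feature channel.
    patch = [
        [1.0, 3.0, 2.0, 0.0],
        [4.0, 2.0, 1.0, 5.0],
        [0.0, 1.0, 9.0, 8.0],
        [2.0, 2.0, 7.0, 6.0],
    ]
    print(max_pool_2d(patch))
    print(l2_penalty([1.0, -2.0, 3.0], lam=0.5))
```

Pooling halves each spatial dimension of the feature map while keeping the strongest response in each window, and the L2 term penalizes large weights, which is the usual motivation for using it when training data is scarce.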