Tena Alberto, Juez-Garcia Ivan, Benítez Iván D, Clariá Francesc, González Jessica, de Batlle Jordi, Solsona Francesc
Department of Computer Science and Digital Design, University of Lleida, Lleida, 25001, Spain.
Group of Translational Research in Respiratory Medicine, IRBLleida, Hospital Universitari Arnau de Vilanova i Santa Maria, Lleida, 25198, Spain.
JAMIA Open. 2025 Aug 4;8(4):ooaf083. doi: 10.1093/jamiaopen/ooaf083. eCollection 2025 Aug.
Chronic obstructive pulmonary disease (COPD) is the third leading cause of death worldwide, with up to 70% of cases remaining undiagnosed. This paper proposes a COPD screening tool based on time-frequency representation features of self-recorded respiratory sounds.
Respiratory sound samples (breath and cough sounds) from COPD and asymptomatic non-COPD volunteers were drawn from a large, scientific-purpose database. We analyzed 39 time-frequency representation features of breath and cough sounds, combined with age, sex, and smoking status, using autoencoder neural networks and random forest (RF) algorithms. We compared the performance of different breath and cough RF models built to detect COPD: one based exclusively on sound features, one based exclusively on sociodemographic characteristics, and one based on both sound features and sociodemographic characteristics.
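The three-way model comparison described above can be sketched as follows. This is a minimal illustration, not the authors' pipeline: the data are synthetic random numbers, the autoencoder step is omitted, and the split ratio, RF hyperparameters, and all variable names are assumptions chosen only for the example (it assumes NumPy and scikit-learn are installed).

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
n = 600
y = rng.integers(0, 2, size=n)  # 1 = COPD, 0 = non-COPD (synthetic labels)

# Synthetic stand-ins for the 39 sound features and 3 sociodemographic columns.
sound = rng.normal(size=(n, 39)) + 0.3 * y[:, None]
age = rng.normal(60, 10, size=n) + 5 * y
sex = rng.integers(0, 2, size=n)
smoking = rng.integers(0, 2, size=n)
socio = np.column_stack([age, sex, smoking])

# One feature matrix per model variant compared in the abstract.
feature_sets = {
    "sound_only": sound,
    "socio_only": socio,
    "combined": np.hstack([sound, socio]),
}

# Use the same stratified split for all variants so the AUCs are comparable.
idx_train, idx_test = train_test_split(
    np.arange(n), test_size=0.3, stratify=y, random_state=0
)

auc_by_model = {}
for name, X in feature_sets.items():
    clf = RandomForestClassifier(n_estimators=200, random_state=0)
    clf.fit(X[idx_train], y[idx_train])
    scores = clf.predict_proba(X[idx_test])[:, 1]  # predicted P(COPD)
    auc_by_model[name] = roc_auc_score(y[idx_test], scores)

for name, auc in auc_by_model.items():
    print(f"{name}: AUC = {auc:.3f}")
```

Because the inputs are random, the printed AUCs say nothing about the study's results; the sketch only shows how models built on different feature subsets can be evaluated on a shared test split.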
Models including breathing features outperformed models exclusively based on sociodemographic characteristics. Specifically, the model combining sociodemographic characteristics and breathing features achieved an area under the curve (AUC), accuracy, sensitivity, and specificity of 0.901, 0.836, 0.871, and 0.761, respectively, in the test set, representing a substantial increase in AUC when compared to the model based exclusively on sociodemographic characteristics (0.901 vs 0.818).
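For readers less familiar with the four reported metrics, the sketch below shows how they are conventionally computed from true labels and model scores. The function names, the 0.5 decision threshold, and the toy data are assumptions made for illustration; the abstract does not state the threshold the authors used.

```python
def screening_metrics(y_true, y_pred):
    """Accuracy, sensitivity, and specificity from binary labels (1 = COPD)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return {
        "accuracy": (tp + tn) / (tp + tn + fp + fn),
        "sensitivity": tp / (tp + fn),  # share of COPD cases detected
        "specificity": tn / (tn + fp),  # share of non-COPD correctly cleared
    }


def auc(y_true, scores):
    """AUC as the probability that a positive outscores a negative (ties count 0.5)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(1.0 if p > q else 0.5 if p == q else 0.0 for p in pos for q in neg)
    return wins / (len(pos) * len(neg))


# Toy example (hypothetical scores, not study data).
y_true = [1, 1, 1, 0, 0]
scores = [0.9, 0.8, 0.3, 0.4, 0.1]
y_pred = [1 if s >= 0.5 else 0 for s in scores]  # assumed 0.5 threshold

metrics = screening_metrics(y_true, y_pred)
metrics["auc"] = auc(y_true, scores)
print(metrics)
```

Sensitivity and specificity depend on the chosen threshold, whereas AUC summarizes ranking quality across all thresholds, which is why the abstract compares models primarily on AUC.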
Our results suggest that a lightweight collection of the time-frequency representation features of self-recorded breathing sounds could effectively improve the predictive performance of COPD screening or case-finding questionnaires.
COPD screening through self-recorded breathing sounds could be easily integrated as a low-cost first step in case-finding programs, potentially helping to mitigate COPD underdiagnosis.