IEEE J Biomed Health Inform. 2022 Jun;26(6):2746-2757. doi: 10.1109/JBHI.2022.3152944. Epub 2022 Jun 3.
Cough, a symptom associated with many prevalent respiratory diseases, can serve as a potential biomarker for diagnosis and disease progression. Consequently, cough monitoring systems and, in particular, automatic cough detection algorithms have been studied since the early 2000s. Recently, there has been an increased focus on the efficiency of such algorithms, as implementation on consumer-centric devices such as smartphones would provide a scalable and affordable solution for monitoring cough with contact-free sensors. Current algorithms, however, cannot distinguish between the coughs of different individuals and thus cannot function reliably in shared environments where multiple individuals may have to be monitored. Therefore, we propose a weakly supervised metric learning approach for cougher recognition based on smartphone audio recordings of coughs. Our approach involves a triplet network architecture, which employs convolutional neural networks (CNNs). The CNNs of the triplet network learn an embedding function, which maps Mel spectrograms of cough recordings to an embedding space where they are more easily distinguishable. Using audio recordings of nocturnal coughs from asthmatic patients captured with a smartphone, our approach achieved a mean accuracy of 88% (± 10% SD) on two-way identification tests with 12 enrollment samples, and an accuracy of 80% and an equal error rate (EER) of 20% on verification tests. Furthermore, our approach outperformed human raters on verification tests, on average by 8% in accuracy, 4% in false acceptance rate (FAR), and 12% in false rejection rate (FRR). Our code and models are publicly available.
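The triplet objective and the verification error rates mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the Euclidean distance, the margin of 1.0, the two-dimensional toy embeddings, and the function names `euclidean`, `triplet_loss`, and `far_frr` are all assumptions made for illustration. In practice the embeddings would come from a CNN applied to Mel spectrograms.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length embedding vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def triplet_loss(anchor, positive, negative, margin=1.0):
    """Hinge-style triplet loss: max(0, d(a, p) - d(a, n) + margin).

    The loss is zero once the negative (a cough from a different person)
    is at least `margin` farther from the anchor than the positive
    (another cough from the same person). The margin value is an assumption.
    """
    d_ap = euclidean(anchor, positive)
    d_an = euclidean(anchor, negative)
    return max(0.0, d_ap - d_an + margin)


def far_frr(distances, same_cougher, threshold):
    """False acceptance and false rejection rates at a distance threshold.

    A pair is accepted as "same cougher" when its distance is <= threshold.
    FAR is the share of different-cougher pairs that are accepted;
    FRR is the share of same-cougher pairs that are rejected.
    """
    genuine = [d for d, same in zip(distances, same_cougher) if same]
    impostor = [d for d, same in zip(distances, same_cougher) if not same]
    far = sum(d <= threshold for d in impostor) / len(impostor)
    frr = sum(d > threshold for d in genuine) / len(genuine)
    return far, frr


# Hypothetical toy embeddings (not data from the study).
anchor = [0.0, 0.0]
positive = [0.0, 1.0]   # distance 1 from the anchor
negative = [3.0, 4.0]   # distance 5 from the anchor

loss = triplet_loss(anchor, positive, negative)

# Hypothetical pair distances with ground-truth same/different labels.
pair_distances = [0.5, 2.0, 0.8, 3.0]
pair_labels = [True, True, False, False]
far, frr = far_frr(pair_distances, pair_labels, threshold=1.0)
```

Sweeping `threshold` and picking the value at which FAR and FRR coincide gives the equal error rate (EER), which is the operating point reported for the verification tests.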