Department of Electrical and Electronic Engineering, Stellenbosch University, South Africa.
SAMRC Centre for Tuberculosis Research, Division of Molecular Biology and Human Genetics, DSI/NRF Centre of Excellence for Biomedical Tuberculosis Research, Faculty of Medicine and Health Sciences, Stellenbosch University, South Africa.
Physiol Meas. 2021 Nov 26;42(10). doi: 10.1088/1361-6579/ac2fb8.
We consider the automatic discrimination between the coughing sounds produced by patients with tuberculosis (TB) and those produced by patients with other lung ailments. We present experiments based on a dataset of 1358 forced cough recordings obtained in a developing-world clinic from 16 patients with confirmed active pulmonary TB and 35 patients suffering from respiratory conditions suggestive of TB but confirmed to be TB negative. Using nested cross-validation, we have trained and evaluated five machine learning classifiers: logistic regression (LR), support vector machines, k-nearest neighbour, multilayer perceptrons and convolutional neural networks. Although classification is possible in all cases, the best performance is achieved using LR. In combination with feature selection by sequential forward selection, our best LR system achieves an area under the ROC curve (AUC) of 0.94 using 23 features selected from a set of 78 high-resolution mel-frequency cepstral coefficients. This system achieves a sensitivity of 93% at a specificity of 95% and thus exceeds the specification of 90% sensitivity at 70% specificity that the World Health Organisation (WHO) considers a minimal requirement for a community-based TB triage test. The automatic classification of cough audio, when applied to symptomatic patients requiring investigation for TB, can meet the WHO triage specifications for the identification of patients who should undergo expensive molecular downstream testing. This makes it a promising and viable means of low-cost, easily deployable frontline screening for TB, which can especially benefit developing countries with a heavy TB burden.
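The evaluation quantities named in the abstract (AUC, sensitivity, specificity and the WHO triage specification) can be illustrated with a minimal sketch. The code below is not the authors' implementation: the function names are hypothetical, and the labels and scores are synthetic values invented purely for illustration, not data from the study. Only the 90% sensitivity and 70% specificity thresholds and the reported 93%/95% operating point are taken from the abstract.

```python
from typing import Sequence, Tuple


def roc_auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the probability that a random positive outscores a random negative.

    Tied scores count as half a win (Mann-Whitney U formulation).
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def sensitivity_specificity(
    labels: Sequence[int], scores: Sequence[float], threshold: float
) -> Tuple[float, float]:
    """Sensitivity and specificity when scores >= threshold are called TB positive."""
    pairs = list(zip(labels, scores))
    tp = sum(1 for y, s in pairs if y == 1 and s >= threshold)
    fn = sum(1 for y, s in pairs if y == 1 and s < threshold)
    tn = sum(1 for y, s in pairs if y == 0 and s < threshold)
    fp = sum(1 for y, s in pairs if y == 0 and s >= threshold)
    return tp / (tp + fn), tn / (tn + fp)


def meets_who_triage_spec(
    sens: float, spec: float, min_sens: float = 0.90, min_spec: float = 0.70
) -> bool:
    """Check an operating point against the WHO minimal triage requirement."""
    return sens >= min_sens and spec >= min_spec


# Synthetic example (assumption): 1 = TB positive, 0 = TB negative.
labels = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
scores = [0.9, 0.8, 0.7, 0.4, 0.6, 0.3, 0.3, 0.2, 0.1, 0.05]

auc = roc_auc(labels, scores)
sens, spec = sensitivity_specificity(labels, scores, threshold=0.35)
print(f"AUC={auc:.3f} sensitivity={sens:.2f} specificity={spec:.2f}")
print("meets WHO triage spec:", meets_who_triage_spec(sens, spec))

# Operating point reported in the abstract: 93% sensitivity at 95% specificity.
print("reported system meets spec:", meets_who_triage_spec(0.93, 0.95))
```

The decision threshold trades sensitivity against specificity, so a triage system is judged at a chosen operating point rather than by AUC alone. In a study with many coughs per patient, such metrics would normally be computed on held-out patients so that no patient contributes to both training and test data.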