Institute for Health Policy, Sri Lanka and Robert Gordon University, UK.
Department of Statistics, University of Colombo, Sri Lanka.
Health Informatics J. 2024 Jul-Sep;30(3):14604582241283968. doi: 10.1177/14604582241283968.
Cost-effective asthma diagnosis is challenging because symptom patterns vary widely among patients. This study aims to develop a machine learning-based prediction tool for self-detection of asthma. Data from 6,665 participants in the Sri Lanka Health and Ageing Study (2018-2019) were used. Thirteen machine learning algorithms were employed: Logistic Regression, Support Vector Machine, Decision Tree, Random Forest, Naïve Bayes, K-Nearest Neighbors, Gradient Boost, XGBoost, AdaBoost, CatBoost, LightGBM, Multi-Layer Perceptron, and Probabilistic Neural Network. A hybrid of Logistic Regression and LightGBM outperformed the other models, achieving an AUC of 0.9062 and a sensitivity of 79.85%. Key predictive features for asthma include wheezing, breathlessness with wheezing, shortness of breath attacks, coughing attacks, chest tightness, nasal allergies, physical activity, passive smoking, ethnicity, and residential sector. Combining Logistic Regression and LightGBM models can effectively predict adult asthma from self-reported symptoms and demographic and behavioural characteristics. The proposed expert system can assist clinicians and patients in diagnosing potential asthma cases.
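The abstract does not state how the Logistic Regression and LightGBM models were combined, nor which classification threshold was used. As a minimal sketch only, the Python below assumes one common approach, an equal-weight average of the two models' predicted probabilities, and shows how AUC and sensitivity could then be computed. The function names, the 0.5 threshold, and the toy probabilities are all hypothetical illustrations and are not taken from the study.

```python
# Illustrative sketch only: probability averaging is an ASSUMED way to form
# a "hybrid" of two classifiers; the study's actual method is not described
# in the abstract. All numbers below are made-up toy values.

def average_probabilities(p_a, p_b, weight_a=0.5):
    """Blend two models' predicted probabilities with a weighted average."""
    return [weight_a * a + (1.0 - weight_a) * b for a, b in zip(p_a, p_b)]

def sensitivity(y_true, y_prob, threshold=0.5):
    """True positive rate: share of actual positives scored at or above threshold."""
    tp = sum(1 for y, p in zip(y_true, y_prob) if y == 1 and p >= threshold)
    fn = sum(1 for y, p in zip(y_true, y_prob) if y == 1 and p < threshold)
    return tp / (tp + fn)

def auc(y_true, y_prob):
    """AUC as the probability that a random positive outranks a random negative
    (ties count as half a win)."""
    pos = [p for y, p in zip(y_true, y_prob) if y == 1]
    neg = [p for y, p in zip(y_true, y_prob) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Toy labels (1 = asthma, 0 = no asthma) and hypothetical model outputs.
y_true = [1, 1, 1, 0, 0, 0]
p_logreg = [0.9, 0.6, 0.3, 0.4, 0.2, 0.1]    # stand-in for Logistic Regression
p_lightgbm = [0.7, 0.8, 0.2, 0.7, 0.1, 0.3]  # stand-in for LightGBM

p_hybrid = average_probabilities(p_logreg, p_lightgbm)
print("AUC:", auc(y_true, p_hybrid))
print("Sensitivity:", sensitivity(y_true, p_hybrid))
```

In a screening setting, lowering the threshold generally raises sensitivity at the cost of more false positives, so the reported sensitivity depends on whichever cut-off the authors chose.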