Choi Joon Young, Rhee Chin Kook
Department of Internal Medicine, Incheon St. Mary's Hospital, College of Medicine, The Catholic University of Korea, Seoul, Republic of Korea.
Department of Internal Medicine, Seoul St. Mary's Hospital, College of Medicine, The Catholic University of Korea, Seoul, Republic of Korea.
J Asthma Allergy. 2024 Aug 13;17:783-789. doi: 10.2147/JAA.S471964. eCollection 2024.
Asthma is a chronic inflammatory airway disease that imposes a significant burden; exacerbations can severely affect quality of life and healthcare costs. Advances in big data analysis and artificial intelligence have made it possible to predict future exacerbations more accurately. This study used an integrated dataset combining Korean National Health Insurance, meteorological, air pollution, and viral data from national public databases to develop a model that predicts asthma exacerbations on a daily basis in South Korea. We merged these sources and applied random forest, AdaBoost, XGBoost, and LightGBM machine learning models to compare their performance in predicting future exacerbations. Of the models, XGBoost (AUROC of 0.68 and accuracy of 0.96) and LightGBM (AUROC of 0.67 and accuracy of 0.96) were the most promising. Important variables common to the models were the number of visits and exacerbations per year and medical resource utilization, including the prescription of asthma medications. Comorbid diabetes, hypertension, gastroesophageal reflux, arthritis, metabolic syndrome, osteoporosis, and ischemic heart disease were also associated with elevated exacerbation risk. The models examined in this study highlight the importance of previous exacerbations, use of medical resources, and comorbidities in the prediction of future exacerbations in patients with asthma.
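The abstract reports both AUROC and accuracy, and the two can diverge sharply when the predicted event is rare. The sketch below is illustrative only and is not the authors' code or data: it builds a synthetic set of patient-days with a hypothetical 4% event rate (an assumption chosen for illustration, not a figure from the study) and computes both metrics in plain Python. The helper names `accuracy` and `auroc` and the noise model for the "weak" score are likewise assumptions.

```python
import random


def accuracy(y_true, y_pred):
    """Fraction of labels predicted correctly."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def auroc(y_true, scores):
    """AUROC via the rank-sum (Mann-Whitney U) identity, with average ranks for ties."""
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    ranks = [0.0] * len(scores)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of tied scores starting at position i.
        while j + 1 < len(order) and scores[order[j + 1]] == scores[order[i]]:
            j += 1
        avg_rank = (i + j) / 2 + 1  # average 1-based rank of the tied run
        for k in range(i, j + 1):
            ranks[order[k]] = avg_rank
        i = j + 1
    n_pos = sum(y_true)
    n_neg = len(y_true) - n_pos
    pos_rank_sum = sum(r for r, t in zip(ranks, y_true) if t == 1)
    return (pos_rank_sum - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)


# Hypothetical synthetic data: 10,000 patient-days, 4% with an exacerbation.
y_true = [1] * 400 + [0] * 9600

# Trivial baseline: always predict "no exacerbation".
baseline_pred = [0] * len(y_true)
baseline_score = [0.0] * len(y_true)
baseline_acc = accuracy(y_true, baseline_pred)   # 0.96
baseline_auc = auroc(y_true, baseline_score)     # 0.5

# A weakly informative score: positives are shifted up slightly, with heavy noise.
rng = random.Random(0)
weak_score = [rng.gauss(0.7 * t, 1.0) for t in y_true]
weak_auc = auroc(y_true, weak_score)

print(f"baseline accuracy={baseline_acc:.2f}, AUROC={baseline_auc:.2f}")
print(f"weak score AUROC={weak_auc:.2f}")
```

Under these assumptions, a model that never predicts an exacerbation already reaches high accuracy while its AUROC stays at chance level, which is why AUROC is usually the more informative of the two figures when comparing models on rare outcomes.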