Division of Pulmonary and Critical Care Medicine, Department of Internal Medicine, College of Medicine, Uijeongbu St. Mary's Hospital, The Catholic University of Korea, Seoul, Republic of Korea.
Department of Applied Statistics, Yonsei University, Seoul, Republic of Korea.
BMC Pulm Med. 2023 Jun 6;23(1):196. doi: 10.1186/s12890-023-02479-4.
National Health Insurance data have been actively analyzed for academic research and to establish scientific evidence for health care service policy in asthma. However, the accuracy of data extracted through the conventional operational definition is limited. In this study, we verified the accuracy of the conventional operational definition of asthma by applying it in a real hospital setting. Using a machine learning technique, we then established an operational definition that predicts asthma more accurately.
Using the conventional operational definition of asthma, we extracted asthma patients at Seoul St. Mary's Hospital and St. Paul's Hospital of the Catholic University of Korea between January 2017 and January 2018. Of these extracted patients, 10% were randomly sampled. We verified the accuracy of the conventional operational definition by comparing it against the actual diagnosis established through medical chart review. We then applied machine learning approaches to predict asthma more accurately.
A total of 4,235 patients with asthma were identified using the conventional asthma definition during the study period. Of these, data from 353 patients were collected. On chart review, 56% of the study population had asthma and 44% did not. The use of machine learning techniques improved the overall accuracy. The XGBoost prediction model for asthma diagnosis showed an accuracy of 87.1%, an AUC of 93.0%, a sensitivity of 82.5%, and a specificity of 97.9%. The major explanatory variables for a correct diagnosis of asthma were ICS/LABA, LAMA, and LTRA.
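As a reminder of how accuracy, sensitivity, and specificity relate to one another, the following minimal Python sketch computes them from the four cells of a confusion matrix. The function name `classification_metrics` and the counts are hypothetical, chosen only for illustration. They are not the study's data, and the sketch does not reproduce the authors' XGBoost pipeline.

```python
def classification_metrics(tp: int, fp: int, tn: int, fn: int) -> dict:
    """Return accuracy, sensitivity, and specificity from confusion-matrix counts.

    tp: chart-confirmed asthma, predicted asthma
    fn: chart-confirmed asthma, predicted non-asthma
    tn: chart-confirmed non-asthma, predicted non-asthma
    fp: chart-confirmed non-asthma, predicted asthma
    """
    total = tp + fp + tn + fn
    return {
        "accuracy": (tp + tn) / total,   # share of all patients classified correctly
        "sensitivity": tp / (tp + fn),   # share of true asthma patients detected
        "specificity": tn / (tn + fp),   # share of non-asthma patients correctly excluded
    }


# Hypothetical counts for illustration only (not taken from the study).
metrics = classification_metrics(tp=40, fp=5, tn=45, fn=10)
for name, value in metrics.items():
    print(f"{name}: {value:.1%}")
```

With these illustrative counts, accuracy is the share of all 100 patients classified correctly, while sensitivity and specificity are each computed within one chart-review group, which is why the three figures can differ noticeably.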
The conventional operational definition of asthma is limited in its ability to identify true asthma patients in the real world. An accurate, standardized operational definition of asthma is therefore needed. This study suggests that a machine learning approach could be a good option for building a relevant operational definition in research using claims data.