Department of Biomedical and Health Informatics, Children's Hospital of Philadelphia, Philadelphia, Pennsylvania, United States of America.
Center for Pediatric Clinical Effectiveness, Children's Hospital of Philadelphia, Philadelphia, Pennsylvania, United States of America.
PLoS One. 2021 Mar 1;16(3):e0247784. doi: 10.1371/journal.pone.0247784. eCollection 2021.
Early childhood asthma diagnosis is common; however, many children diagnosed before age 5 experience symptom resolution, and it remains difficult to identify individuals whose symptoms will persist. Our objective was to develop machine learning models to identify which individuals diagnosed with asthma before age 5 continue to experience asthma-related visits. We curated a retrospective dataset for 9,934 children derived from electronic health record (EHR) data. We trained five machine learning models to differentiate individuals without subsequent asthma-related visits (transient diagnosis) from those with asthma-related visits between ages 5 and 10 (persistent diagnosis), given clinical information up to age 5 years. Based on average NPV-Specificity area (ANSA), all models performed significantly better than random chance, with XGBoost obtaining the best performance (0.43 mean ANSA). Feature importance analysis indicated age of last asthma diagnosis under 5 years, total number of asthma-related visits, self-identified Black race, allergic rhinitis, and eczema as important features. Although our models appear to perform well, a lack of prior models utilizing a large number of features to predict individual persistence makes direct comparison infeasible. However, feature importance analysis indicates that our models are consistent with prior research identifying diagnosis age and prior health service utilization as important predictors of persistent asthma. We therefore conclude that machine learning models can predict, with good performance, which individuals will experience persistent asthma, and that they may be useful to guide clinician and parental decisions regarding asthma counselling in early childhood.
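The abstract names the evaluation metric (ANSA) but does not define how it is computed. As a rough illustration only, the sketch below computes negative predictive value (NPV) and specificity at several score thresholds and then takes the trapezoidal area under the resulting NPV-versus-specificity curve. The function names, the toy labels and scores, the threshold grid, and the use of the trapezoid rule are all assumptions made for this example; this is not the authors' ANSA implementation, and the averaging step implied by "average" is not reproduced here.

```python
from typing import List, Sequence, Tuple


def npv_specificity_points(
    y_true: Sequence[int],
    scores: Sequence[float],
    thresholds: Sequence[float],
) -> List[Tuple[float, float]]:
    """Return (specificity, NPV) pairs, one per usable threshold.

    Assumed convention: label 1 = persistent, label 0 = transient, and a
    score >= threshold is predicted persistent (positive).
    """
    points = []
    for t in thresholds:
        tn = sum(1 for y, s in zip(y_true, scores) if y == 0 and s < t)
        fp = sum(1 for y, s in zip(y_true, scores) if y == 0 and s >= t)
        fn = sum(1 for y, s in zip(y_true, scores) if y == 1 and s < t)
        if tn + fp == 0 or tn + fn == 0:
            continue  # specificity or NPV is undefined at this threshold
        specificity = tn / (tn + fp)
        npv = tn / (tn + fn)
        points.append((specificity, npv))
    return sorted(points)  # order by specificity for the area computation


def trapezoid_area(points: Sequence[Tuple[float, float]]) -> float:
    """Trapezoidal area under a curve given as x-sorted (x, y) points."""
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2.0
    return area


# Toy data (hypothetical, not from the study).
toy_labels = [0, 0, 1, 0, 1, 1]
toy_scores = [0.1, 0.3, 0.35, 0.6, 0.7, 0.9]
toy_thresholds = [0.2, 0.4, 0.65]

curve = npv_specificity_points(toy_labels, toy_scores, toy_thresholds)
area = trapezoid_area(curve)
print(curve)
print(round(area, 4))
```

Because the area here is taken only over the specificity range that the chosen thresholds reach, its value is not comparable to the 0.43 mean ANSA reported above.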