Department of Public Health Sciences, Queen's University, Kingston, ON, K7L 3N6, Canada.
Department of Pediatrics and Translational Medicine, SickKids Research Institute, The Hospital for Sick Children, Toronto, ON, Canada.
BMC Med Res Methodol. 2024 Nov 1;24(1):262. doi: 10.1186/s12874-024-02376-2.
Asthma is a heterogeneous disease that affects millions of children and adults. No objective gold-standard diagnostic test applies across all ages; instead, clinicians diagnose asthma from a cluster of signs, symptoms, and objective tests that depend on age. Yet chronic asthma symptoms carry clear morbidity. Machine learning has become a popular tool for improving asthma diagnosis and classification, but there is a paucity of literature on the use of Bayesian machine learning algorithms to predict asthma diagnosis in children. This paper develops a prediction model using Bayesian additive regression trees (BART) and compares its performance with that of various machine learning algorithms in predicting the diagnosis of childhood asthma.
Clinically relevant variables collected at or before 3 years of age from 2794 participants in the CHILD Cohort Study were used to predict physician-diagnosed asthma at age 5. BART and six other commonly used machine learning algorithms (adaptive boosting, logistic regression, decision tree, neural network, random forest, and support vector machine) were trained. Performance measures, including sensitivity, specificity, and area under the receiver operating characteristic (ROC) curve, were calculated, with confidence intervals obtained from bootstrap samples. Important predictors and interaction effects associated with asthma were also identified using BART.
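To illustrate the performance measures named above, the following is a minimal sketch of how sensitivity, specificity, area under the ROC curve, and a bootstrap confidence interval could be computed. It is not the study's code: the abstract does not state which bootstrap variant or classification threshold was used, so the percentile interval, the 0.5 threshold, the function names, and the toy labels and scores are all assumptions made for illustration.

```python
import random


def sensitivity_specificity(y_true, y_pred):
    """Return (sensitivity, specificity) for binary labels coded 0/1."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    return tp / (tp + fn), tn / (tn + fp)


def roc_auc(y_true, scores):
    """AUC as the probability that a random case outscores a random
    non-case, counting ties as one half (Mann-Whitney formulation)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def bootstrap_ci(y_true, scores, metric, n_boot=1000, alpha=0.05, seed=0):
    """Percentile bootstrap interval for metric(y_true, scores).

    Assumption: a plain percentile interval over resampled participants;
    the source does not specify the bootstrap variant.
    """
    rng = random.Random(seed)
    n = len(y_true)
    stats = []
    while len(stats) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]
        yt = [y_true[i] for i in idx]
        if len(set(yt)) < 2:
            continue  # a resample needs both classes for the metric to exist
        stats.append(metric(yt, [scores[i] for i in idx]))
    stats.sort()
    lower = stats[max(0, int((alpha / 2) * n_boot))]
    upper = stats[min(n_boot - 1, int((1 - alpha / 2) * n_boot))]
    return lower, upper


# Toy data (hypothetical, not from the CHILD Cohort Study).
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
scores = [0.9, 0.8, 0.4, 0.3, 0.2, 0.6, 0.1, 0.7]
y_pred = [1 if s >= 0.5 else 0 for s in scores]  # assumed 0.5 threshold

sens, spec = sensitivity_specificity(y_true, y_pred)
auc = roc_auc(y_true, scores)
ci_low, ci_high = bootstrap_ci(y_true, scores, roc_auc)
```

The same `bootstrap_ci` helper can wrap any metric with the signature `metric(y_true, scores)`, so the intervals for different models can be produced with one resampling routine.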
BART, logistic regression, and random forest showed the highest area under the ROC curve among the algorithms compared. Based on BART, recurrent wheeze, respiratory infection, and food sensitization at 3 years of age were the most important predictors. The three most important interaction effects were between respiratory infection and recurrent wheezing at 3 years, between maternal asthma and paternal asthma, and between maternal wheezing and the child's inhalant sensitization at 3 years.
BART demonstrated promising prediction performance when compared with other machine learning algorithms. Future research could validate the BART model in an external cohort to evaluate its reliability and generalizability.