Department of Computer Science, National Yang Ming Chiao Tung University, Hsinchu, Taiwan.
National Institute of Environmental Health Sciences, National Health Research Institutes, Miaoli, Taiwan.
Sci Total Environ. 2021 Jul 10;777:145982. doi: 10.1016/j.scitotenv.2021.145982. Epub 2021 Feb 20.
The incidence of childhood atopic dermatitis (AD) and allergic rhinitis (AR) is increasing, which warrants the development of measures to predict and prevent these conditions. We aimed to investigate the ability of a spectrum of data mining methods to predict childhood AD and AR using longitudinal birth cohort data. We conducted a 14-year follow-up of infants born to pregnant women who had undergone maternal examinations at nine selected maternity hospitals across Taiwan during 2000-2005. The subjects were interviewed using structured questionnaires to record data on basic demographics, socioeconomic status, lifestyle, medical history, and 24-h dietary recall. Hourly concentrations of air pollutants within 1 year before childbirth were obtained from 76 national air quality monitoring stations in Taiwan. We used a weighted K-nearest neighbour method (k = 3) to infer personalized air pollution exposure. Machine learning methods were applied to the heterogeneous attribute set to predict allergic diseases in children. A total of 1439 mother-infant pairs were included in the machine learning analysis. The prevalence of AD and AR in children up to 14 years of age was 6.8% and 15.9%, respectively. Overall, tree-based models achieved higher sensitivity and specificity than other methods, with areas under the receiver operating characteristic curve of 83% for AD and 84% for AR. Our findings confirmed that prenatal air quality is an important factor affecting predictive ability. Moreover, different air quality indices predicted better in combination than separately. Combining heterogeneous attributes, including environmental exposures, demographic information, and allergens, is the key to better prediction of childhood allergies in the general population. Prenatal exposure to nitrogen dioxide (NO₂) and changes in its concentration over time were significant predictors of AD and AR until adolescence.
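The weighted K-nearest neighbour step (k = 3) used to infer personalized exposure can be illustrated with a minimal Python sketch. The abstract does not specify the distance metric or the weighting scheme, so great-circle (haversine) distance and inverse-distance weights are assumptions here, and the function names `haversine_km` and `weighted_knn_exposure` are hypothetical rather than taken from the study.

```python
import math


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two points given in degrees."""
    earth_radius_km = 6371.0
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2)
    return 2 * earth_radius_km * math.asin(math.sqrt(a))


def weighted_knn_exposure(home, stations, k=3):
    """Estimate the pollutant concentration at `home` from the k nearest stations.

    home:     (lat, lon) of the residence, in degrees.
    stations: list of (lat, lon, concentration) tuples.
    Weights are inverse distances (assumed; the abstract gives no scheme).
    """
    if not stations:
        raise ValueError("at least one monitoring station is required")

    # Keep the k stations closest to the residence as (distance, concentration).
    nearest = sorted(
        (haversine_km(home[0], home[1], lat, lon), conc)
        for lat, lon, conc in stations
    )[:k]

    # A residence located exactly at a station takes that station's reading.
    if nearest[0][0] == 0.0:
        return nearest[0][1]

    weights = [1.0 / dist for dist, _ in nearest]
    weighted_sum = sum(w * conc for w, (_, conc) in zip(weights, nearest))
    return weighted_sum / sum(weights)
```

With made-up inputs, three equidistant stations reading 10, 20, and 30 contribute equally, so the estimate is their mean of 20, while a fourth, distant station is ignored because k = 3. In practice the same function would be applied per pollutant and per time window to build the exposure features.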