Center for Climate Change Adaptation, National Institute for Environmental Studies, Tsukuba, Ibaraki, Japan.
Sci Total Environ. 2022 Nov 1;845:157312. doi: 10.1016/j.scitotenv.2022.157312. Epub 2022 Jul 13.
Epidemiological studies have associated environmental factors with adverse health effects. The main exposure variable is usually chosen from prior knowledge or by statistical methods. This choice can be difficult when evidence supporting prior knowledge is scarce, or when statistical methods must contend with collinearity. This study aimed to assess the relative importance of environmental variables for under-five mortality in Malaysia using a random forest approach.
We applied conditional permutation importance via random forest (CPI-RF) to evaluate the relative importance of weather- and air pollution-related environmental factors for daily under-five mortality in Malaysia. The study period spanned January 1, 2014 to December 31, 2016. In data preparation, deviation mortality counts were derived through a generalized additive model that adjusted for long-term trend and seasonality. Analyses were conducted by mortality cause (all-cause, natural-cause, or external-cause) and by data structure (continuous, categorical, or all types [i.e., all continuous and all categorical variables together]). The main analysis comprised two stages. In Stage 1, Boruta selection was applied as a preliminary screen to remove highly unimportant variables. In Stage 2, the variables retained by Boruta were used in the CPI-RF analysis. The final importance value was the average over a 10-fold cross-validation.
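The two-stage procedure can be illustrated with a minimal Python sketch on synthetic data. It is not the authors' implementation: `shadow_screen` is a single-pass simplification of Boruta (the real algorithm iterates with statistical tests), `conditional_permute` approximates conditional permutation by shuffling a variable within quantile bins of its most correlated companion variable, and the way importance is averaged over the 10 folds is an assumption. All function names, the simulated variables, and the use of scikit-learn are hypothetical choices made for illustration.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import KFold


def shadow_screen(X, y, seed=0):
    """Stage 1 (simplified, single pass): keep features whose forest
    importance exceeds that of the best shuffled 'shadow' copy."""
    rng = np.random.default_rng(seed)
    shadows = rng.permuted(X, axis=0)  # shuffle each column independently
    rf = RandomForestRegressor(n_estimators=200, random_state=seed)
    rf.fit(np.hstack([X, shadows]), y)
    imp = rf.feature_importances_
    p = X.shape[1]
    return np.flatnonzero(imp[:p] > imp[p:].max())


def conditional_permute(X, j, rng, n_bins=4):
    """Shuffle column j within quantile bins of its most correlated
    companion column (a crude stand-in for conditional permutation)."""
    Xp = X.copy()
    if X.shape[1] < 2:  # nothing to condition on: plain permutation
        Xp[:, j] = rng.permutation(X[:, j])
        return Xp
    corr = np.abs(np.corrcoef(X, rowvar=False)[j])
    corr[j] = -1.0  # exclude the variable itself
    k = int(np.argmax(corr))
    edges = np.quantile(X[:, k], np.linspace(0, 1, n_bins + 1)[1:-1])
    bins = np.digitize(X[:, k], edges)
    for b in np.unique(bins):
        idx = np.flatnonzero(bins == b)
        Xp[idx, j] = rng.permutation(X[idx, j])
    return Xp


def cv_conditional_importance(X, y, n_splits=10, seed=0):
    """Stage 2: increase in held-out MSE after conditional permutation,
    averaged over the cross-validation folds."""
    rng = np.random.default_rng(seed)
    scores = np.zeros((n_splits, X.shape[1]))
    folds = KFold(n_splits=n_splits, shuffle=True, random_state=seed)
    for f, (train, test) in enumerate(folds.split(X)):
        rf = RandomForestRegressor(n_estimators=100, random_state=seed)
        rf.fit(X[train], y[train])
        base = mean_squared_error(y[test], rf.predict(X[test]))
        for j in range(X.shape[1]):
            Xp = conditional_permute(X[test], j, rng)
            scores[f, j] = mean_squared_error(y[test], rf.predict(Xp)) - base
    return scores.mean(axis=0)


# Synthetic stand-in for detrended daily deviation counts (hypothetical data).
rng = np.random.default_rng(42)
n = 400
x0 = rng.normal(size=n)               # true driver
x1 = x0 + 0.3 * rng.normal(size=n)    # collinear with x0, no direct effect
x2 = rng.normal(size=n)               # weaker true driver
noise = rng.normal(size=(n, 2))       # irrelevant variables
X = np.column_stack([x0, x1, x2, noise])
y = 2.0 * x0 + x2 + 0.5 * rng.normal(size=n)

kept = shadow_screen(X, y)                              # Stage 1
importance = cv_conditional_importance(X[:, kept], y)   # Stage 2
for col, value in zip(kept, importance):
    print(f"x{col}: {value:.3f}")
```

Permuting within bins of a correlated variable is what separates conditional from marginal permutation importance: a variable that matters only because it is collinear with a true driver gains less importance this way, which is the collinearity concern raised in the background.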
Some heat-related variables (maximum temperature, heat wave), temperature variability, and haze-related variables (PM10, PM10-derived haze index, PM10- and fire-derived haze index, fire hotspot) were among the prominent variables associated with under-five mortality in Malaysia. The important variables were consistent across all-cause and natural-cause mortality and across sensitivity analyses. However, the most important variables differed between natural-cause and external-cause under-five mortality.
Heat-related variables, temperature variability, and haze-related variables were consistently prominent for all-cause and natural-cause under-five mortality, but not for external-cause mortality.