University at Buffalo, State University of New York, Buffalo, USA.
Research and Education in Energy, Environment, and Water (RENEW) Institute, University at Buffalo, State University of New York, 108 Cooke Hall, Buffalo, NY, 14260, USA.
Sci Rep. 2021 Mar 26;11(1):6955. doi: 10.1038/s41598-021-85381-5.
Type 2 diabetes mellitus (T2D) prevalence in the United States varies substantially across spatial and temporal scales, a pattern attributable to variation in socioeconomic and lifestyle risk factors. Understanding how the contributions of these risk factors to T2D vary would greatly benefit intervention and treatment approaches aimed at reducing or preventing T2D. Geographically weighted random forest (GW-RF), a tree-based non-parametric machine learning model, may help explore and visualize the relationships between T2D and risk factors at the county level. GW-RF outputs are compared to those of global (RF and OLS) and local (GW-OLS) models for the years 2013-2017, using low education, poverty, obesity, physical inactivity, access to exercise, and food environment as inputs. Our results indicate that, given these six major risk factors, a non-parametric GW-RF model shows high potential for explaining the spatial heterogeneity of T2D prevalence, and for predicting it, relative to traditional local and global models. Some of these improvements in prediction, however, are marginal. The spatial heterogeneity revealed by GW-RF demonstrates the need to consider local factors in prevention approaches. Spatial analysis of T2D and associated risk factor prevalence offers useful information for geographically targeting prevention and disease interventions.
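The contrast between global and local models can be illustrated with a minimal sketch of geographically weighted least squares, the idea behind the GW-OLS comparison model. This is not the authors' implementation: the function name `gw_ols_fit`, the Gaussian kernel, the fixed bandwidth, and the synthetic data are all assumptions made for illustration. A GW-RF-style model would follow the same pattern but fit a random forest at each location in place of the weighted regression.

```python
import numpy as np

def gw_ols_fit(coords, X, y, bandwidth):
    """Fit one distance-weighted least-squares model per location.

    Returns the local coefficients (intercept first) and in-sample predictions.
    Hypothetical helper; kernel and bandwidth choices are assumptions.
    """
    n, p = X.shape
    A = np.column_stack([np.ones(n), X])           # design matrix with intercept
    betas = np.empty((n, p + 1))
    for i in range(n):
        d = np.linalg.norm(coords - coords[i], axis=1)
        w = np.exp(-0.5 * (d / bandwidth) ** 2)    # Gaussian distance-decay weights
        sw = np.sqrt(w)                            # weighted LS via row scaling
        betas[i] = np.linalg.lstsq(A * sw[:, None], y * sw, rcond=None)[0]
    preds = np.sum(A * betas, axis=1)              # each point uses its own model
    return betas, preds

# Synthetic data (not from the study): the effect of one risk factor
# grows from west to east, so no single global slope fits everywhere.
rng = np.random.default_rng(0)
n = 300
coords = rng.uniform(0.0, 1.0, size=(n, 2))
X = rng.normal(size=(n, 1))
true_slope = 1.0 + 2.0 * coords[:, 0]
y = true_slope * X[:, 0] + rng.normal(scale=0.1, size=n)

# Global OLS: one coefficient vector for the whole map.
A = np.column_stack([np.ones(n), X])
beta_global = np.linalg.lstsq(A, y, rcond=None)[0]
rmse_global = float(np.sqrt(np.mean((y - A @ beta_global) ** 2)))

# Local model: one coefficient vector per location.
betas, preds = gw_ols_fit(coords, X, y, bandwidth=0.15)
rmse_local = float(np.sqrt(np.mean((y - preds) ** 2)))
slope_corr = float(np.corrcoef(betas[:, 1], true_slope)[0, 1])

print(f"global RMSE: {rmse_global:.3f}, local RMSE: {rmse_local:.3f}")
print(f"correlation of local slopes with true slopes: {slope_corr:.3f}")
```

On data like this, the local slopes can be mapped to show where a risk factor matters most, which is the kind of spatial heterogeneity the abstract refers to. Both errors here are measured in-sample, so the comparison illustrates fit rather than out-of-sample predictive skill.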