Department of Population and Quantitative Health Sciences, Case Western Reserve University School of Medicine, Cleveland, Ohio.
Kellogg School of Management, Northwestern University, Evanston, Illinois.
JAMA Netw Open. 2022 Sep 1;5(9):e2230925. doi: 10.1001/jamanetworkopen.2022.30925.
IMPORTANCE: The association between cancer mortality and risk factors may vary by geography. However, conventional methodological approaches rarely account for this variation.
OBJECTIVE: To identify geographic variations in the association between risk factors and cancer mortality.
DESIGN, SETTING, AND PARTICIPANTS: This geospatial cross-sectional study used county-level data from the National Center for Health Statistics for individuals who died of cancer from 2008 to 2019. Risk factor data were obtained from County Health Rankings & Roadmaps, Health Resources and Services Administration, and Centers for Disease Control and Prevention. Analyses were conducted from October 2021 to July 2022.
MAIN OUTCOMES AND MEASURES: Conventional random forest models were applied nationwide and by US region, and the geographical random forest model (accounting for local variation of association) was applied to assess associations between a wide range of risk factors and cancer mortality.
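The contrast between a single nationwide forest and a geographically local one can be illustrated with a minimal sketch. This is not the study's code: it assumes scikit-learn's `RandomForestRegressor`, uses impurity-based importance as a stand-in for the study's variable importance measure, defines "local" as the `k` nearest locations by Euclidean distance, and runs on synthetic data. The function name `local_importance` and all parameter values are hypothetical.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor


def local_importance(coords, X, y, k=15, n_trees=30, seed=0):
    """Fit one small forest per location on its k nearest neighbours.

    Returns an (n_locations, n_features) array of impurity-based
    importances, one row per location. Illustrative only.
    """
    n = len(y)
    out = np.empty((n, X.shape[1]))
    for i in range(n):
        # Distance from location i to every location (itself included).
        dist = np.linalg.norm(coords - coords[i], axis=1)
        nearest = np.argsort(dist)[:k]
        forest = RandomForestRegressor(n_estimators=n_trees, random_state=seed)
        forest.fit(X[nearest], y[nearest])
        out[i] = forest.feature_importances_
    return out


# Synthetic "counties": the outcome depends on feature 0 in the west
# and on feature 1 in the east; feature 2 is pure noise.
rng = np.random.default_rng(0)
n = 80
coords = rng.uniform(0, 1, size=(n, 2))
X = rng.normal(size=(n, 3))
west = coords[:, 0] < 0.5
y = np.where(west, 3.0 * X[:, 0], 3.0 * X[:, 1]) + rng.normal(scale=0.3, size=n)

# Conventional model: one forest, one importance vector for the whole map.
global_forest = RandomForestRegressor(n_estimators=100, random_state=0).fit(X, y)
global_imp = global_forest.feature_importances_

# Geographically local models: one importance vector per location.
local_imp = local_importance(coords, X, y)

print("global importance:", np.round(global_imp, 2))
print("mean local importance, west:", np.round(local_imp[west].mean(axis=0), 2))
print("mean local importance, east:", np.round(local_imp[~west].mean(axis=0), 2))
```

In this toy setup the global forest reports one blended ranking, while the local forests recover a different leading feature on each side of the map. Published geographical random forest implementations typically add a bandwidth choice and combine local with global predictions, which this sketch omits.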
RESULTS: The study included 7 179 201 individuals (median age, 70-74 years; 3 409 508 women [47.5%]) who died from cancer in 3108 counties in the contiguous US during 2008 to 2019. The mean (SD) county-level cancer mortality rate was 177.0 (26.4) deaths per 100 000 people. On the basis of the variable importance measure, the random forest models identified multiple risk factors associated with cancer mortality, including smoking, receipt of Supplemental Nutrition Assistance Program (SNAP) benefits, and obesity. The geographical random forest model further identified risk factors whose importance varied at the county level. For example, receipt of SNAP benefits was a high-importance factor in the Appalachian region, North and South Dakota, and Northern California; smoking was of high importance in Kentucky and Tennessee; and female-headed households were a high-importance factor in North and South Dakota. Geographic areas with certain high-importance risk factors did not consistently have a corresponding high prevalence of the same risk factors.
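One way to check whether local importance tracks local prevalence is a rank correlation across counties. The sketch below is an assumption about how such a check could be done, not the study's analysis; the function names are hypothetical and the five county values are invented purely for illustration.

```python
def ranks(values):
    """Return 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for pos in range(i, j + 1):
            result[order[pos]] = average_rank
        i = j + 1
    return result


def spearman(a, b):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    ra, rb = ranks(a), ranks(b)
    n = len(ra)
    mean_a, mean_b = sum(ra) / n, sum(rb) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(ra, rb))
    var_a = sum((x - mean_a) ** 2 for x in ra)
    var_b = sum((y - mean_b) ** 2 for y in rb)
    return cov / (var_a * var_b) ** 0.5


# Hypothetical values for five counties (not data from the study).
smoking_prevalence = [10, 20, 30, 40, 50]          # percent of adults
smoking_importance = [0.5, 0.1, 0.4, 0.2, 0.3]     # local importance score

rho = spearman(smoking_prevalence, smoking_importance)
print(round(rho, 2))  # -0.3
```

A correlation near 1 would mean that importance simply mirrors prevalence; a value near zero or below, as in this invented example, would be consistent with the pattern described above, where high-importance areas are not necessarily high-prevalence areas.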
CONCLUSIONS AND RELEVANCE: In this cross-sectional study, the associations between cancer mortality and risk factors varied by geography in a way that did not correspond strictly to risk factor prevalence. The degree to which other place-specific characteristics, observed and unobserved, modify risk factor effects should be further explored. This work suggests that risk factor importance may be a preferable paradigm to risk factor prevalence for selecting cancer control interventions.