Liu Lingbo, Cowan Lauren, Wang Fahui, Onega Tracy
Center for Geographic Analysis, Harvard University, MA, 02138, USA.
Department of Population Health Sciences, University of Utah, Huntsman Cancer Institute, Salt Lake City, UT, 84112, USA.
Health Place. 2025 Jan;91:103411. doi: 10.1016/j.healthplace.2024.103411. Epub 2025 Jan 6.
This study employs an innovative multi-constraint Monte Carlo simulation method to estimate suppressed county-level cancer counts for population subgroups and extends the downscaling from counties to ZIP Code Tabulation Areas (ZCTAs) in the U.S. Taking as constraints the known cancer counts at a higher geographic level and for larger demographic groups at the same geographic level, the method uses the population structure as the sampling probability in the Monte Carlo simulation to estimate suppressed data entries. It not only ensures consistency across data levels but also accounts for the demographic structure that drives varying cancer risks. The 2016-2020 cancer incidence data from the Utah Cancer Registry are used to validate the approach. The method yields results with high precision and consistency across the full urban-rural continuum, and significantly outperforms several machine-learning models such as Random Forest and Extreme Gradient Boosting.
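The constrained allocation described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `simulate_suppressed`, the single sum-to-total constraint, the rejection step, and the suppression cap of 10 are all assumptions made for illustration. It distributes the count left over after subtracting the disclosed cells among the suppressed cells, using population shares as multinomial probabilities, and keeps only draws that respect the cap.

```python
import numpy as np

def simulate_suppressed(total, known, population, cap, n_sims=1000, seed=0):
    """Estimate suppressed cells (None in `known`) so all cells sum to `total`.

    Hypothetical helper: `cap` is an assumed upper bound on a suppressed count.
    """
    rng = np.random.default_rng(seed)
    suppressed = [i for i, k in enumerate(known) if k is None]
    if not suppressed:
        return list(known)

    # Constraint 1: suppressed cells must add up to the undisclosed residual.
    residual = total - sum(k for k in known if k is not None)
    if residual < 0 or residual > cap * len(suppressed):
        raise ValueError("constraints are infeasible")

    # Population structure supplies the sampling probabilities.
    pop = np.array([population[i] for i in suppressed], dtype=float)
    probs = pop / pop.sum()

    accepted = []
    attempts = 0
    while len(accepted) < n_sims and attempts < 100 * n_sims:
        attempts += 1
        draw = rng.multinomial(residual, probs)
        # Constraint 2 (assumed): a suppressed cell cannot exceed the cap.
        if draw.max() <= cap:
            accepted.append(draw)
    if not accepted:
        raise RuntimeError("no draw satisfied the constraints")

    # Average the accepted draws; each sums to `residual`, so the mean does too.
    mean = np.mean(accepted, axis=0)
    filled = list(known)
    for i, value in zip(suppressed, mean):
        filled[i] = float(value)
    return filled

# Made-up example: one disclosed subgroup and three suppressed subgroups.
estimates = simulate_suppressed(
    total=40,
    known=[25, None, None, None],
    population=[5000, 1200, 800, 400],
    cap=10,
)
print(estimates)
```

In this sketch the disclosed cell is returned unchanged and the three estimates sum to the residual of 15, so the higher-level total is preserved. Additional constraints, such as totals for larger demographic groups or ZCTA-to-county consistency, would enter as further acceptance checks.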