Division of Violence Prevention, National Center for Injury Prevention and Control, Centers for Disease Control and Prevention, Atlanta, Georgia.
Division of Injury Prevention, National Center for Injury Prevention and Control, Centers for Disease Control and Prevention, Atlanta, Georgia.
JAMA Netw Open. 2023 Mar 1;6(3):e233413. doi: 10.1001/jamanetworkopen.2023.3413.
IMPORTANCE: Firearm homicides are a major public health concern; the lack of timely mortality data presents considerable challenges to an effective response. Near real-time data sources offer potential for more timely estimation of firearm homicides.
OBJECTIVE: To estimate the near real-time burden of weekly and annual firearm homicides in the US.
DESIGN, SETTING, AND PARTICIPANTS: In this prognostic study, anonymous, longitudinal time series data were obtained from multiple data sources, including Google and YouTube search trends related to firearms (2014-2019), emergency department visits for firearm injuries (National Syndromic Surveillance Program, 2014-2019), emergency medical service activations for firearm-related injuries (biospatial, 2014-2019), and National Domestic Violence Hotline contacts flagged with the keyword firearm (2016-2019). Data analysis was performed from September 2021 to September 2022.
MAIN OUTCOMES AND MEASURES: Weekly estimates of US firearm homicides were calculated using a 2-phase pipeline, first fitting optimal machine learning models for each data stream and then combining the best individual models into a stacked ensemble model. Model accuracy was assessed by comparing predictions of firearm homicides in 2019 with actual firearm homicides identified by National Vital Statistics System death certificates. Results were also compared with a SARIMA (seasonal autoregressive integrated moving average) model, a common method to forecast injury mortality.
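The 2-phase idea can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the weekly data are synthetic, the stream names are placeholders, each per-stream model is a simple least-squares line, and the meta-learner is a linear regression on the base predictions. The study's actual models, features, and tuning are not described in this abstract.

```python
import math
import random

def fit_line(x, y):
    """Least-squares fit of y ~ a + b*x; returns (a, b)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    b = sxy / sxx
    return my - b * mx, b

def solve(A, b):
    """Solve A w = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [bi] for row, bi in zip(A, b)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[p] = M[p], M[c]
        for r in range(c + 1, n):
            f = M[r][c] / M[c][c]
            for k in range(c, n + 1):
                M[r][k] -= f * M[c][k]
    w = [0.0] * n
    for r in range(n - 1, -1, -1):
        w[r] = (M[r][n] - sum(M[r][k] * w[k] for k in range(r + 1, n))) / M[r][r]
    return w

def fit_meta(P, y):
    """Linear meta-learner with intercept, fit via the normal equations."""
    Z = [[1.0] + list(row) for row in P]
    p = len(Z[0])
    A = [[sum(z[i] * z[j] for z in Z) for j in range(p)] for i in range(p)]
    b = [sum(z[i] * yi for z, yi in zip(Z, y)) for i in range(p)]
    return solve(A, b)

def rmse(pred, actual):
    return math.sqrt(sum((p - a) ** 2 for p, a in zip(pred, actual)) / len(actual))

# Synthetic weekly data: every number below is invented for illustration.
rng = random.Random(0)
deaths = [270 + 25 * math.sin(2 * math.pi * t / 52) + rng.gauss(0, 10)
          for t in range(260)]
streams = {
    "ed_visits": [2.0 * d + rng.gauss(0, 30) for d in deaths],
    "ems_activations": [1.5 * d + rng.gauss(0, 40) for d in deaths],
    "search_trends": [0.2 * d + rng.gauss(0, 8) for d in deaths],
}
names = list(streams)
base_idx, meta_idx, test_idx = slice(0, 156), slice(156, 208), slice(208, 260)

# Phase 1: fit one model per data stream on the earliest weeks.
base = {n: fit_line(streams[n][base_idx], deaths[base_idx]) for n in names}

def base_preds(idx):
    """Per-stream predictions for the selected weeks, keyed by stream name."""
    return {n: [a + b * x for x in streams[n][idx]] for n, (a, b) in base.items()}

# Phase 2: fit the meta-learner on base predictions for weeks
# that the base models did not see during fitting.
meta_p = base_preds(meta_idx)
weights = fit_meta(list(zip(*(meta_p[n] for n in names))), deaths[meta_idx])

def ensemble(preds):
    """Combine per-stream predictions using the fitted meta-learner weights."""
    rows = zip(*(preds[n] for n in names))
    return [weights[0] + sum(w * p for w, p in zip(weights[1:], row)) for row in rows]

# Evaluate on the final held-out year, mirroring the 2019 comparison.
test_p = base_preds(test_idx)
ens_test = ensemble(test_p)
for n in names:
    print(n, round(rmse(test_p[n], deaths[test_idx]), 2))
print("ensemble", round(rmse(ens_test, deaths[test_idx]), 2))
```

Fitting the meta-learner on weeks held out from the base models is one common way to keep the ensemble from simply rewarding whichever base model overfit most; cross-validated out-of-fold predictions would be another option.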
RESULTS: Both individual and ensemble models yielded highly accurate estimates of firearm homicides. The weekly root mean square error (RMSE) of individual models ranged from 24.95 for emergency department visits to 31.29 for SARIMA forecasting. Ensemble models combining data sources had lower weekly error and higher annual accuracy than individual data sources: the all-source ensemble model had a weekly RMSE of 24.46 deaths and full-year accuracy of 99.74%, predicting the total number of firearm homicides in 2019 to within 38 deaths (compared with 95.48% accuracy and an error of 652 deaths for the SARIMA model). The model decreased the reporting lag for weekly firearm homicides from 7 to 8 months to approximately 6 weeks.
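The two reported metrics can be sketched as follows. The RMSE formula is standard; the definition of full-year accuracy as 1 minus the relative error of the annual total is an assumption, since the abstract does not spell it out. The weekly counts below are invented, not study data.

```python
import math

def rmse(predicted, observed):
    """Root mean square error across weeks, in deaths per week."""
    squared = [(p - o) ** 2 for p, o in zip(predicted, observed)]
    return math.sqrt(sum(squared) / len(squared))

def annual_accuracy(predicted, observed):
    """Assumed definition: 1 - |predicted total - observed total| / observed total."""
    total_pred, total_obs = sum(predicted), sum(observed)
    return 1 - abs(total_pred - total_obs) / total_obs

# Hypothetical weekly counts (illustration only).
observed = [280, 265, 300, 290]
predicted = [270, 275, 295, 292]

print("weekly RMSE:", rmse(predicted, observed))
print("annual accuracy:", annual_accuracy(predicted, observed))
```

The two measures answer different questions: weekly errors of opposite sign cancel in the annual total, so a model can have a high full-year accuracy while its weekly RMSE remains well above zero.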
CONCLUSIONS AND RELEVANCE: In this prognostic study applying machine learning to diverse secondary data sources, ensemble modeling produced accurate near real-time estimates of weekly and annual firearm homicides and substantially decreased data source time lags. Ensemble model forecasts can accelerate the ability of public health practitioners and policy makers to respond to unanticipated shifts in firearm homicides.