National Library of Medicine, Lister Hill National Center for Biomedical Communications, Bethesda, Maryland, United States.
Methods Inf Med. 2023 Sep;62(3-04):100-109. doi: 10.1055/a-2015-1244. Epub 2023 Jan 18.
Public health emergencies leave little time to develop novel surveillance efforts. Understanding which preexisting clinical datasets are fit for surveillance use is of high value. Coronavirus disease 2019 (COVID-19) offers a natural applied informatics experiment to understand the fitness of clinical datasets for use in disease surveillance.
This study evaluates the agreement among legacy surveillance time series and assesses their relative fitness for use in understanding the severity of the COVID-19 emergency. Here, fitness for use means the statistical agreement between events across series.
Thirteen weekly clinical event series from before and during the COVID-19 era for the United States were collected and integrated into a (multi) time series event data model. The data sources considered in this study were the Centers for Disease Control and Prevention (CDC) COVID-19 attributable mortality, the CDC's excess mortality model, national Emergency Medical Services (EMS) calls, and Medicare encounter-level claims. Cases were indexed by week from January 2015 through June 2021 and modeled with Distributed Random Forest models. Each model returned the variable importance when predicting the series of interest from the remaining time series.
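The "predict each series from the remaining series" design can be illustrated with a minimal sketch. It makes several assumptions that are not in the study: scikit-learn's `RandomForestRegressor` is swapped in for the Distributed Random Forest models, the out-of-bag score is used as the r² estimate, the data are synthetic, and the series names (`ems_cardiac_arrest`, `covid_claims`, `unrelated_noise`) and the helper `cross_series_importance` are hypothetical.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor


def cross_series_importance(series, n_trees=200, seed=0):
    """For each weekly series, fit a random forest that predicts it from
    all other series and report r2 plus per-predictor importance.

    `series` maps a series name to a 1-D array of weekly values; all
    arrays must have the same length. Hypothetical helper, not the
    study's code.
    """
    names = list(series)
    results = {}
    for target in names:
        if target == "study_week":
            continue  # the week index is used only as a predictor
        predictors = [n for n in names if n != target]
        X = np.column_stack([series[n] for n in predictors])
        y = np.asarray(series[target], dtype=float)
        model = RandomForestRegressor(
            n_estimators=n_trees, oob_score=True, random_state=seed
        )
        model.fit(X, y)
        results[target] = {
            # Out-of-bag R^2: an assumption standing in for the paper's r2.
            "r2": float(model.oob_score_),
            # Impurity-based importances; they sum to 1 across predictors.
            "importance": dict(
                zip(predictors, model.feature_importances_.tolist())
            ),
        }
    return results


# Synthetic weekly data (illustrative only, not the study's data).
rng = np.random.default_rng(42)
weeks = np.arange(338)
# A late "pandemic wave" that is zero for most of the study period.
wave = np.where(
    weeks >= 270,
    50.0 * np.clip(np.sin((weeks - 270) / 10.0), 0.0, None),
    0.0,
)
series = {
    "study_week": weeks.astype(float),
    "ems_cardiac_arrest": 200 + 2 * wave + rng.normal(0, 5, weeks.size),
    "covid_claims": 10 * wave + rng.normal(0, 20, weeks.size),
    "unrelated_noise": rng.normal(100, 10, weeks.size),
}

results = cross_series_importance(series)
for target, out in results.items():
    ranked = sorted(out["importance"].items(), key=lambda kv: -kv[1])
    print(target, round(out["r2"], 2), ranked[0][0])
```

In this sketch, a series that tracks the wave should rank above the unrelated series when predicting `covid_claims`, which is the sense of "agreement" between series used here.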
Model r² statistics ranged from 0.78 to 0.99 for the share of the volumes predicted correctly. Prehospital data were of high value, and cardiac arrest (CA) prior to EMS arrival was on average the best predictor (tied with study week). COVID-19 Medicare claims volumes can predict COVID-19 death certificates (agreement), while viral respiratory Medicare claim volumes cannot predict Medicare COVID-19 claims (disagreement).
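For reference, assuming the reported statistic is the usual coefficient of determination (the abstract does not define it), r² for a target series with observed weekly values y_t, model predictions ŷ_t, and mean ȳ over the T study weeks would be:

```latex
r^2 = 1 - \frac{\sum_{t=1}^{T} \left( y_t - \hat{y}_t \right)^2}
               {\sum_{t=1}^{T} \left( y_t - \bar{y} \right)^2}
```

Under that assumption, a value of 0.99 means the model's weekly predictions leave about 1% of the variance in the target series unexplained, and 0.78 leaves about 22%.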
Prehospital EMS data should be considered when evaluating the severity of COVID-19 because prehospital CA known to EMS was the strongest predictor on average across indices.