Department of Public Health, Simmons University, Boston, Massachusetts, United States of America.
Department of Data Science and Neuroscience, Simmons University, Boston, Massachusetts, United States of America.
PLoS One. 2022 Nov 3;17(11):e0251470. doi: 10.1371/journal.pone.0251470. eCollection 2022.
IMPORTANCE: The rapid proliferation of COVID-19 has left governments scrambling, and several data aggregators are now assisting in the reporting of county-level cases and deaths. The different variables affecting reporting (e.g., time delays in reporting) necessitate a well-documented reliability study that examines the data methods and discusses possible causes of differences between aggregators.
OBJECTIVE: To statistically evaluate the reliability of COVID-19 data across aggregators using case fatality rate (CFR) estimates and reliability statistics.
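As a rough illustration of the quantity being compared, the CFR is the number of reported deaths divided by the number of reported cases, computed separately from each aggregator's counts. The sketch below uses made-up counts and placeholder aggregator names (`aggregator_a`, etc.); it is not the authors' code, and the figures are not from the study.

```python
from typing import Dict, Tuple


def case_fatality_rate(deaths: int, cases: int) -> float:
    """Naive CFR: reported deaths divided by reported cases."""
    if cases <= 0:
        raise ValueError("cases must be positive")
    if deaths < 0 or deaths > cases:
        raise ValueError("deaths must be between 0 and cases")
    return deaths / cases


# Hypothetical (cases, deaths) pairs for one county on one day,
# as reported by three placeholder aggregators.
counts: Dict[str, Tuple[int, int]] = {
    "aggregator_a": (1000, 50),
    "aggregator_b": (1000, 40),
    "aggregator_c": (980, 49),
}

# CFR per aggregator, and the gap between the highest and lowest estimate.
cfr = {name: case_fatality_rate(d, c) for name, (c, d) in counts.items()}
spread = max(cfr.values()) - min(cfr.values())

for name, value in cfr.items():
    print(f"{name}: CFR = {value:.3f}")
print(f"spread across aggregators = {spread:.3f}")
```

Even when case counts agree, a difference in reported deaths changes the CFR directly, which is why disagreement in death counts matters for the estimates.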
DESIGN, SETTING, AND PARTICIPANTS: Cases and deaths were collected daily by volunteers from state and local health departments (primary sources) and from newspaper reports (secondary sources). To enable a statistical reliability comparison, BroadStreet also collected data from other COVID-19 aggregators, including USAFacts, Johns Hopkins University, the New York Times, and The COVID Tracking Project.
MAIN OUTCOMES AND MEASURES: COVID-19 case and death counts at the county and state levels.
RESULTS: Lower levels of inter-rater agreement across aggregators were observed for the number of deaths, and this disagreement was reflected in state-level Bayesian estimates of COVID-19 fatality rates.
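The abstract does not say which agreement statistic or Bayesian model was used, so the following is only a minimal sketch under stated assumptions: agreement is measured with a one-way random-effects intraclass correlation, ICC(1,1), treating counties as targets and aggregators as raters, and the fatality rate is given a Beta(1, 1) prior with a binomial likelihood. The function names and all counts are hypothetical.

```python
from typing import Sequence


def icc_oneway(ratings: Sequence[Sequence[float]]) -> float:
    """One-way random-effects ICC(1,1).

    Each row is one target (e.g. a county); each column is one rater
    (e.g. an aggregator). This choice of statistic is an assumption.
    """
    n = len(ratings)
    k = len(ratings[0]) if n else 0
    if n < 2 or k < 2 or any(len(row) != k for row in ratings):
        raise ValueError("need a rectangular table with >= 2 rows and columns")
    grand_mean = sum(sum(row) for row in ratings) / (n * k)
    row_means = [sum(row) / k for row in ratings]
    # Mean square between targets and mean square within targets.
    ms_between = k * sum((m - grand_mean) ** 2 for m in row_means) / (n - 1)
    ms_within = sum(
        (x - m) ** 2 for row, m in zip(ratings, row_means) for x in row
    ) / (n * (k - 1))
    denom = ms_between + (k - 1) * ms_within
    if denom == 0:
        raise ValueError("ratings have no variance")
    return (ms_between - ms_within) / denom


def posterior_mean_cfr(deaths: int, cases: int,
                       prior_a: float = 1.0, prior_b: float = 1.0) -> float:
    """Posterior mean of the fatality rate under an assumed Beta prior."""
    if deaths < 0 or cases < deaths:
        raise ValueError("need 0 <= deaths <= cases")
    return (prior_a + deaths) / (prior_a + prior_b + cases)


# Hypothetical death counts: 3 counties (rows) x 3 aggregators (columns).
identical_counts = [[10, 10, 10], [20, 20, 20], [30, 30, 30]]
differing_counts = [[10, 14, 7], [20, 15, 26], [30, 33, 24]]

icc_identical = icc_oneway(identical_counts)
icc_differing = icc_oneway(differing_counts)
print(f"ICC when aggregators agree exactly: {icc_identical:.2f}")
print(f"ICC when aggregators differ:        {icc_differing:.2f}")

# Same hypothetical case count, two different reported death counts.
print(f"posterior mean CFR (50 deaths): {posterior_mean_cfr(50, 1000):.4f}")
print(f"posterior mean CFR (40 deaths): {posterior_mean_cfr(40, 1000):.4f}")
```

In this toy setup, lower agreement on death counts shows up as a lower ICC, and the differing death counts produce different posterior fatality rates for the same number of cases.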
CONCLUSIONS AND RELEVANCE: A national, publicly available data set is needed to improve the reliability of reporting for current and future disease outbreaks.