University of Neuchâtel, Institute of Financial Analysis, Neuchâtel, 2000, Switzerland.
Sci Data. 2022 Mar 29;9(1):112. doi: 10.1038/s41597-022-01245-1.
This database provides the daily time-series of COVID-19 cases, deaths, recovered people, tests, vaccinations, and hospitalizations, for more than 230 countries, 760 regions, and 12,000 lower-level administrative divisions. The geographical entities are associated with identifiers to match with hydrometeorological, geospatial, and mobility data. The database includes policy measures at the national and, when available, sub-national levels. The data acquisition pipeline is open-source and fully automated. As most governments revise the data retrospectively, the database always updates the complete time-series to mirror the original source. Vintage data, immutable snapshots of the data taken each day, are provided to ensure research reproducibility. The latest data are updated on an hourly basis, and the vintage data are available since April 14, 2020. All the data are available in CSV files or SQLite format. By unifying the access to the data, this work makes it possible to study the pandemic on a global scale with high resolution, taking into account within-country variations, nonpharmaceutical interventions, and environmental and exogenous variables.
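The role of the vintage snapshots can be illustrated with a minimal sketch that compares an older snapshot against the latest data to find values a source revised retrospectively. The inline CSV excerpts, the region identifier `X1`, the column names (`id`, `date`, `confirmed`), and the helper functions are hypothetical and chosen only for illustration; they are not taken from the database documentation.

```python
import csv
import io

# Hypothetical excerpts for one region, taken on two different days.
# Column names and values are assumptions made for this illustration.
VINTAGE_CSV = """id,date,confirmed
X1,2020-04-10,100
X1,2020-04-11,120
X1,2020-04-12,150
"""

LATEST_CSV = """id,date,confirmed
X1,2020-04-10,100
X1,2020-04-11,125
X1,2020-04-12,150
X1,2020-04-13,180
"""


def load_series(text):
    """Parse CSV text into a dict mapping (id, date) to the cumulative count."""
    reader = csv.DictReader(io.StringIO(text))
    return {(row["id"], row["date"]): int(row["confirmed"]) for row in reader}


def find_revisions(vintage, latest):
    """Return {(id, date): (old, new)} for values that differ between snapshots."""
    return {
        key: (old, latest[key])
        for key, old in vintage.items()
        if key in latest and latest[key] != old
    }


vintage = load_series(VINTAGE_CSV)
latest = load_series(LATEST_CSV)
revisions = find_revisions(vintage, latest)
print(revisions)
```

An analysis pinned to a fixed vintage snapshot keeps returning the same numbers, whereas one run against the latest data would silently pick up the revised value for 2020-04-11.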