Department of Mechanical Engineering, National Cheng Kung University, No. 1 University Rd., Tainan, 701, Taiwan.
Department of Applied Physics and Electronics, Umeå University, Umeå, 90187, Sweden.
Sci Data. 2024 Jul 24;11(1):821. doi: 10.1038/s41597-024-03627-z.
The COVID-19 pandemic has flooded open databases with population-level data. However, individual-level structured data, such as the course of disease and contact tracing information, are almost non-existent in open databases. We publish a structured and cleaned COVID-19 dataset with the course of disease and contact tracing information for easy benchmarking of COVID-19 models. We gathered data from Taiwanese open databases and daily news reports. The outcome is a structured quantitative dataset encompassing the course of disease of Taiwanese individuals, alongside their contact tracing information. Our dataset comprises 579 confirmed cases covering the period from January 21 to November 9, 2020, when the original SARS-CoV-2 virus was most prevalent in Taiwan. The data include features such as travel history, age, gender, symptoms, contact types between cases, and the dates of symptom onset, confirmation, critical illness, recovery, and death. We also include daily summary data at the population level from January 21, 2020, to May 23, 2022. Our data can help enhance epidemiological modelling.