Department of Biostatistics and Epidemiology, University of Massachusetts-Amherst, Amherst, MA, 01003, USA.
College of Information and Computer Sciences, University of Massachusetts-Amherst, Amherst, MA, 01003, USA.
Sci Data. 2021 Feb 11;8(1):59. doi: 10.1038/s41597-021-00839-5.
Forecasting has emerged as an important component of informed, data-driven decision-making in a wide array of fields. We introduce a new data model for probabilistic predictions that encompasses a wide range of forecasting settings. This framework clearly defines the constituent parts of a probabilistic forecast and proposes one approach for representing these data elements. The data model is implemented in Zoltar, a new software application that stores forecasts using the data model and provides standardized API access to the data. In one real-time case study, an instance of the Zoltar web application was used to store, provide access to, and evaluate real-time forecast data on the order of 10^8 rows, provided by over 40 international research teams from academia and industry making forecasts of the COVID-19 outbreak in the US. Tools and data infrastructure for probabilistic forecasts, such as those introduced here, will play an increasingly important role in ensuring that future forecasting research adheres to a strict set of rigorous and reproducible standards.