Arribas-Bel Dani, Green Mark, Rowe Francisco, Singleton Alex
Geographic Data Science Lab, Department of Geography and Planning, University of Liverpool, Roxby Building, 74, Bedford St S., Liverpool, L69 7ZT UK.
J Geogr Syst. 2021;23(4):497-514. doi: 10.1007/s10109-021-00363-5. Epub 2021 Oct 20.
This paper develops the notion of "open data product". We define an open data product as the open result of the processes through which a variety of data (open and not) are turned into accessible information through a service, infrastructure, analytics or a combination of all of them, where each step of development is designed to promote open principles. Open data products are born out of a (data) need and add value beyond simply publishing existing datasets. We argue that the process of adding value should adhere to the principles of open (geographic) data science, ensuring openness, transparency and reproducibility. We also contend that outreach, in the form of active communication and dissemination through dashboards, software and publications, is key to engaging end-users and ensuring societal impact. Open data products have major benefits. First, they enable insights from highly sensitive, controlled and/or secure data that may not otherwise be accessible. Second, they can expand the use of commercial and administrative data for the public good by leveraging their high temporal frequency and geographic granularity. We further contend that there is a compelling need for open data products as we experience the current data revolution. New, emerging data sources are unprecedented in temporal frequency and geographical resolution, but they are large, unstructured, fragmented and often hard to access due to privacy and confidentiality concerns. By transforming raw (open or "closed") data into ready-to-use open data products, new dimensions of human geographical processes can be captured and analysed, as we illustrate with existing examples. We conclude by arguing that several parallels exist between the role that open source software played in enabling research on spatial analysis in the 1990s and early 2000s, and the opportunities that open data products offer to unlock the potential of new forms of (geo-)data.