TECNALIA, Basque Research & Technology Alliance (BRTA), 48160 Derio, Spain.
Department of Communications Engineering, Faculty of Engineering, University of the Basque Country (UPV/EHU), 48013 Bilbao, Spain.
Sensors (Basel). 2020 Nov 24;20(23):6712. doi: 10.3390/s20236712.
In the smart city context, Big Data analytics plays an important role in processing the data collected through IoT devices. Analyzing the information gathered by sensors enables specific services and systems that not only improve citizens' quality of life, but also optimize city resources. However, implementing this entire process in real scenarios poses many difficulties, including the huge number and heterogeneity of the devices, their geographical distribution, and the complexity of the necessary IT infrastructures. For this reason, the main contribution of this paper is the PADL description language, which has been specifically tailored to assist in the definition and operationalization phases of the machine learning life cycle. It provides annotations that serve as an abstraction layer over the underlying infrastructure and technologies, hence facilitating the work of data scientists and engineers. Because it supports the operationalization of distributed pipelines over edge, fog, and cloud layers, it is particularly useful in the complex and heterogeneous environments of smart cities. For this purpose, PADL contains functionalities for the specification of monitoring, notification, and actuation capabilities. In addition, we provide tools that facilitate its adoption in production environments. Finally, we showcase the usefulness of the language by defining PADL-compliant analytical pipelines for two use cases in a smart city context (flood control and waste management), demonstrating that its adoption is simple and beneficial for the definition of information and process flows in such environments.