Institute of Automation Technology, Helmut-Schmidt-University, 22043 Hamburg, Germany.
Fraunhofer Institute of Optronics, System Technologies and Image Exploitation IOSB, 76131 Karlsruhe, Germany.
Sensors (Basel). 2021 Mar 30;21(7):2397. doi: 10.3390/s21072397.
Research on Cyber-Physical Systems (CPS) draws on a large number of machine learning methods whose intrinsic hyper-parameters vary widely. Because no agreed-on benchmark datasets for CPS exist, developers of new algorithms are forced to define their own benchmarks. As a result, many algorithms claim benefits over other approaches without a fair comparison. To tackle this problem, this paper defines a novel model of a data generation process similar to that found in CPS. The model is based on well-understood system theory and can generate many datasets that differ in complexity. These data pave the way for comparing selected machine learning methods in the exemplary field of unsupervised learning. The data generation process is evaluated on the synthetic CPS data by analyzing the anomaly detection performance of the Self-Organizing Map, the One-Class Support Vector Machine, and the Long Short-Term Memory Neural Net.
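As a rough illustration of what generating CPS-like data from a system-theoretic model could look like, the sketch below simulates a cascade of first-order lags driven by a square-wave input and adds a labeled sensor-offset anomaly. It is not the model defined in the paper: the function name `generate_dataset`, the parameters `order`, `pole`, `noise_std`, `anomaly` and `offset`, and the choice of anomaly type are all assumptions made for this example. Here `order` stands in, hypothetically, for a complexity setting.

```python
import random


def generate_dataset(n_steps=500, order=2, pole=0.9, noise_std=0.05,
                     anomaly=(300, 350), offset=1.0, seed=0):
    """Simulate a cascade of `order` first-order lags (hypothetical toy model).

    Each stage follows x_i[k+1] = pole * x_i[k] + (1 - pole) * x_{i-1}[k],
    where x_0 is the square-wave input. The measured output is the last
    stage plus Gaussian noise; inside the `anomaly` window a constant
    sensor offset is added and the sample is labeled 1.
    """
    rng = random.Random(seed)          # seeded for reproducible datasets
    state = [0.0] * order
    outputs, labels = [], []
    for k in range(n_steps):
        # Assumed input signal: square wave with a period of 100 steps.
        u = 1.0 if (k // 50) % 2 == 0 else 0.0
        upstream = u
        new_state = []
        for x in state:
            new_state.append(pole * x + (1.0 - pole) * upstream)
            upstream = x               # next stage is fed by the old value
        state = new_state
        is_anomaly = anomaly[0] <= k < anomaly[1]
        y = state[-1] + rng.gauss(0.0, noise_std)
        if is_anomaly:
            y += offset                # assumed anomaly: additive sensor offset
        outputs.append(y)
        labels.append(int(is_anomaly))
    return outputs, labels
```

Varying `order`, `pole` and `noise_std` yields datasets of differing difficulty, and the returned labels could then be used to score any unsupervised anomaly detector on the generated series.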