Vigoya Laura, Fernandez Diego, Carneiro Victor, Cacheda Fidel
Centre for Information and Communications Technology Research (CITIC), Campus de Elviña s/n, 15071 A Coruña, Spain.
Sensors (Basel). 2020 Jul 4;20(13):3745. doi: 10.3390/s20133745.
The relative simplicity of IoT networks increases their exposure to service vulnerabilities and to network failures that reveal system weaknesses. A labeled dataset with a sufficient number of samples and a systematic analysis is therefore essential to understand how these networks behave and to detect traffic anomalies. This work presents DAD: a complete, labeled IoT dataset that reproduces certain real-world behaviors as seen from the network. To bring the dataset closer to a real environment, the data were obtained from a physical data center equipped with temperature sensors based on NFC smart passive sensor technology. After several approaches were tried, mathematical modeling using time series was chosen. The virtual infrastructure used to create the dataset consists of five virtual machines: one MQTT broker and four client nodes, each with four sensors of the refrigeration units connected to the internal IoT network. DAD covers seven days of network activity with three types of anomalies (duplication, interception, and modification of MQTT messages) spread over five days. Finally, the features are described so that the dataset can be used with various prediction or automatic classification techniques.
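The three anomaly types named in the abstract can be illustrated with a minimal sketch on an in-memory list of sensor readings. This is not the authors' generation procedure: the `Reading` record, the topic string `dc/unit1/sensor1`, the temperature values, and the helper names are hypothetical. The sketch also assumes that interception means a reading never reaches the broker, and that modification means its payload value is altered in transit.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Reading:
    """Hypothetical record for one temperature message sent to the broker."""
    seq: int
    topic: str
    temperature: float
    label: str = "normal"


def duplicate(stream, i):
    """Deliver reading i twice; the extra copy carries the anomaly label."""
    copy = replace(stream[i], label="duplication")
    return stream[:i + 1] + [copy] + stream[i + 1:]


def intercept(stream, i):
    """Assumed semantics: reading i is dropped before it reaches the broker."""
    return stream[:i] + stream[i + 1:]


def modify(stream, i, delta):
    """Assumed semantics: the payload of reading i is shifted by delta."""
    changed = replace(
        stream[i],
        temperature=stream[i].temperature + delta,
        label="modification",
    )
    return stream[:i] + [changed] + stream[i + 1:]


# Four illustrative readings from one hypothetical sensor topic.
stream = [Reading(n, "dc/unit1/sensor1", 20.0 + n) for n in range(4)]

duplicated = duplicate(stream, 1)
intercepted = intercept(stream, 1)
modified = modify(stream, 2, 10.0)
```

In this sketch an intercepted reading leaves no record of its own, so it shows up only as a gap in the sequence numbers, whereas duplicated and modified readings can be labeled directly.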