A Time Series Data Filling Method Based on LSTM: Taking the Stem Moisture as an Example.

Affiliations

School of Technology, Beijing Forestry University, Beijing 100083, China.

Beijing Laboratory of Urban and Rural Ecological Environment, Beijing Forestry University, Beijing 100083, China.

Publication information

Sensors (Basel). 2020 Sep 5;20(18):5045. doi: 10.3390/s20185045.

Abstract

To address data loss in sensor data collection, this paper took plant stem moisture data as its object and compared the values filled in by different data filling methods over the same data segment, in order to verify the validity and accuracy of stem moisture filling with the LSTM (Long Short-Term Memory) model. Original stem moisture data were taken from plants grown in the Haidian District of Beijing in June 2017. Part of the data was manually deleted and treated as missing. Interpolation methods, time series statistical methods, the RNN (Recurrent Neural Network), and the LSTM neural network were used to fill in the missing part, and the filling results were compared with the original data. The results show that the LSTM is more accurate than the RNN, and that the bidirectional LSTM model has the smallest error values of the models compared, much lower than those of the other methods. The MAPE (mean absolute percentage error) of the bidirectional LSTM model is 1.813%. Increasing the length of the training data gave results that further supported the effectiveness of the model. Further, to address the accumulation of error in one-dimensional filling, the LSTM model was used in a multi-dimensional filling experiment with environmental data. After comparing the filling results obtained with different environmental parameters, three were selected as input: air humidity, photosynthetically active radiation, and soil temperature. The results show that multi-dimensional filling can greatly extend the sequence length while maintaining accuracy, making up for the tendency of one-dimensional filling to accumulate error as the sequence grows. The minimum MAPE of multi-dimensional filling is 1.499%. In conclusion, the data filling method based on the LSTM neural network has a clear advantage in filling long gaps in time series data and offers a new approach to data filling.
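The evaluation protocol described in the abstract (delete a segment, fill it, compare the filled values with the originals using MAPE) can be illustrated with a minimal sketch. It uses linear interpolation, one of the baseline families the abstract mentions, rather than the LSTM itself. The function names `mape` and `linear_fill` and the sample readings are assumptions made for illustration; they are not taken from the paper.

```python
def mape(actual, filled):
    """Mean absolute percentage error, in percent, over paired values."""
    if len(actual) != len(filled) or not actual:
        raise ValueError("sequences must be non-empty and of equal length")
    return 100.0 * sum(abs((a - f) / a) for a, f in zip(actual, filled)) / len(actual)


def linear_fill(series):
    """Fill interior runs of None by linear interpolation between known neighbours."""
    out = list(series)
    i = 0
    while i < len(out):
        if out[i] is None:
            start = i - 1                  # last known index before the gap
            end = i
            while end < len(out) and out[end] is None:
                end += 1                   # first known index after the gap
            if start < 0 or end == len(out):
                raise ValueError("gap touches the edge of the series")
            step = (out[end] - out[start]) / (end - start)
            for k in range(i, end):
                out[k] = out[start] + step * (k - start)
            i = end
        else:
            i += 1
    return out


# Hypothetical readings for illustration only, not data from the paper.
original = [40.0, 42.0, 45.0, 47.0, 48.0, 46.0]
gap = range(2, 4)                          # indices manually treated as missing
masked = [None if i in gap else v for i, v in enumerate(original)]

filled = linear_fill(masked)
# Score only the positions that were deleted, against the held-back originals.
gap_mape = mape([original[i] for i in gap], [filled[i] for i in gap])
print(filled, round(gap_mape, 3))
```

Scoring only the deleted positions keeps the error from being diluted by values that were never missing. Any other filling method could be substituted for `linear_fill` and scored the same way.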

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/562f/7571071/1b30e762f6d2/sensors-20-05045-g001.jpg
