Pan Zhuofu, Wang Yalin, Wang Kai, Chen Hongtian, Yang Chunhua, Gui Weihua
IEEE Trans Cybern. 2023 Feb;53(2):695-706. doi: 10.1109/TCYB.2022.3167995. Epub 2023 Jan 13.
Missing values are ubiquitous in industrial data sets because of multirate sampling, sensor faults, and transmission failures. Incomplete data obstruct the effective use of data and degrade the performance of data-driven models. Numerous imputation algorithms have been proposed to deal with missing values. Most are based on supervised learning, that is, they impute the missing values by constructing a prediction model from the remaining complete data. Their performance is limited when incomplete data make up a large share of the data set, since few complete samples remain to train on. Moreover, many methods do not consider the autocorrelation of time-series data. This study therefore proposes an adaptive-learned median-filled deep autoencoder (AM-DAE), which aims to impute missing values of industrial time-series data in an unsupervised manner. It continuously replaces the missing values with the median of the input data and its reconstruction, which allows the imputation information to be propagated through the training process. In addition, an adaptive learning strategy guides the AM-DAE to pay more attention to the reconstruction of either nonmissing or missing values in different iteration periods. Finally, two industrial examples are used to verify the superior performance of the proposed method compared with other advanced techniques.
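The refill-while-training idea described in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' AM-DAE. The abstract does not give the network architecture, the exact median rule, or the adaptive schedule, so everything below is an assumption made for illustration: the function name `impute_with_autoencoder`, the one-hidden-layer network, the initial fill with per-variable medians, the refill with the midpoint of the current fill and its reconstruction (the median of two numbers), and the linear loss-weight schedule `alpha`. The sketch also ignores the autocorrelation of time-series data.

```python
import numpy as np


def impute_with_autoencoder(x, n_hidden=4, n_iter=500, lr=0.01, seed=0):
    """Illustrative imputation loop; NOT the AM-DAE from the paper.

    x: 2-D float array (samples x variables), NaN marks a missing entry.
    Returns a copy of x with the missing entries filled in.
    """
    rng = np.random.default_rng(seed)
    missing = np.isnan(x)
    observed = ~missing

    # Assumption: start each missing entry at the median of the observed
    # values of its variable (column).
    col_median = np.nanmedian(x, axis=0)
    filled = np.where(missing, col_median, x)

    # Standardise with statistics of the observed values.
    mean = np.nanmean(x, axis=0)
    std = np.nanstd(x, axis=0) + 1e-8
    z = (filled - mean) / std

    n_samples, n_vars = z.shape
    w1 = rng.normal(0.0, 0.1, (n_vars, n_hidden))
    b1 = np.zeros(n_hidden)
    w2 = rng.normal(0.0, 0.1, (n_hidden, n_vars))
    b2 = np.zeros(n_vars)

    for it in range(n_iter):
        # Forward pass of a one-hidden-layer autoencoder.
        h = np.tanh(z @ w1 + b1)
        recon = h @ w2 + b2

        # Assumption: a linear schedule that shifts the loss weight from
        # observed entries only (early) to all entries equally (late).
        alpha = it / max(n_iter - 1, 1)
        weight = np.where(observed, 1.0, alpha)

        # Gradient of the weighted squared reconstruction error.
        err = weight * (recon - z) / n_samples
        grad_w2 = h.T @ err
        grad_b2 = err.sum(axis=0)
        dh = (err @ w2.T) * (1.0 - h ** 2)
        grad_w1 = z.T @ dh
        grad_b1 = dh.sum(axis=0)

        w1 -= lr * grad_w1
        b1 -= lr * grad_b1
        w2 -= lr * grad_w2
        b2 -= lr * grad_b2

        # Assumption: refill each missing entry with the midpoint of its
        # current fill and its reconstruction; observed entries are kept.
        z = np.where(missing, 0.5 * (z + recon), z)

    # Undo the standardisation for the imputed entries only.
    return np.where(missing, z * std + mean, x)
```

Calling `impute_with_autoencoder(x)` on an array with `NaN` entries returns an array of the same shape in which observed values are unchanged and every `NaN` has been replaced. Because the fills are updated inside the training loop, each reconstruction sees the previous fill as input, which is one way to read the statement that imputation information is carried along with training.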