Cai Miaozhuang, Zheng Yin, Peng Zhengyang, Huang Chunyan, Jiang Haoxia
Guangzhou Power Supply Bureau, Guangdong Power Grid Company, Guangzhou, China.
Guangzhou Benliu Power Technology Company, Guangzhou, China.
PLoS One. 2024 Jun 13;19(6):e0303977. doi: 10.1371/journal.pone.0303977. eCollection 2024.
The complexity of time series data poses new challenges for clustering analysis in fields such as electricity, energy, industry, and finance. Despite advances in representation learning and clustering with deep learning techniques based on Variational Autoencoders (VAE), several problems persist: feature representations lack discriminative power, instance reconstruction is disconnected from the clustering objective, and existing methods scale poorly to large datasets. This paper introduces a deep time series clustering approach that integrates a VAE with metric learning. It uses a VAE built on Gated Recurrent Units (GRU) to extract temporal features, incorporates metric learning to jointly optimize the latent space representation, and employs the sum of log-likelihoods as the criterion for merging clusters, markedly improving clustering accuracy and interpretability. Experiments show a 27.16% improvement in average clustering accuracy and a 47.15% increase in speed on industrial load data. This study offers new insights and tools for the analysis and application of time series data, and the potential of VAEs in time series clustering remains a direction for future research.
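The abstract names the sum of log-likelihoods as the cluster merging criterion but does not give its exact form. The following is a minimal sketch of one plausible reading, not the authors' implementation. It assumes each cluster is modeled by a diagonal Gaussian fitted to its latent vectors and that merging is greedy and agglomerative. The functions `gaussian_loglik_sum` and `merge_until`, the variance floor, and the hard-coded 2-D latent points (standing in for the output of the GRU-based VAE encoder) are all hypothetical.

```python
import math
from itertools import combinations


def gaussian_loglik_sum(points, var_floor=1e-6):
    """Sum of log-likelihoods of `points` under a diagonal Gaussian
    fitted to those same points by maximum likelihood (an assumption)."""
    n = len(points)
    dim = len(points[0])
    total = 0.0
    for d in range(dim):
        col = [p[d] for p in points]
        mean = sum(col) / n
        # Floor the variance so tiny clusters do not produce log(0).
        var = max(sum((x - mean) ** 2 for x in col) / n, var_floor)
        total += sum(
            -0.5 * (math.log(2 * math.pi * var) + (x - mean) ** 2 / var)
            for x in col
        )
    return total


def merge_until(clusters, k):
    """Greedily merge the pair of clusters whose merge loses the least
    summed log-likelihood, until only `k` clusters remain."""
    clusters = [list(group) for group in clusters]
    while len(clusters) > k:
        best = None
        for i, j in combinations(range(len(clusters)), 2):
            merged = gaussian_loglik_sum(clusters[i] + clusters[j])
            separate = (gaussian_loglik_sum(clusters[i])
                        + gaussian_loglik_sum(clusters[j]))
            score = merged - separate  # closer to zero means a cheaper merge
            if best is None or score > best[0]:
                best = (score, i, j)
        _, i, j = best
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]  # j > i, so index i is still valid
    return clusters


# Hypothetical latent vectors: two groups near (0, 0), two near (5, 5).
low_a = [(0.0, 0.0), (0.2, 0.1), (0.1, 0.3)]
low_b = [(0.3, 0.2), (0.1, 0.1), (0.2, 0.4)]
high_a = [(5.0, 5.0), (5.2, 5.1), (5.1, 5.3)]
high_b = [(5.3, 5.2), (5.1, 5.1), (5.2, 5.4)]

merged = merge_until([low_a, low_b, high_a, high_b], k=2)
```

In this toy input, the nearby groups are merged with each other because joining distant groups inflates the fitted variance and sharply lowers the summed log-likelihood. A likelihood-based criterion of this kind scores merges with a probabilistic model of each cluster rather than a raw distance between centroids.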