Zhu Yan, Zhou Rui, Chen Gang, Zhang Baili
School of Design and Art, Shanghai Dianji University, Shanghai, China.
Infrastructure Technology Business Group, Ant Group, Hangzhou, Zhejiang, China.
PeerJ Comput Sci. 2024 Dec 5;10:e2542. doi: 10.7717/peerj-cs.2542. eCollection 2024.
Traditional statistical learning-based sentiment analysis methods often struggle to handle text relevance and temporality effectively. To overcome these limitations, this paper proposes a novel approach integrating Latent Dirichlet Allocation (LDA), Shuffle-enhanced Real-Valued Non-Volume Preserving (RealNVP) flows, a double-layer bidirectional improved Long Short-Term Memory (DBiLSTM) network, and a multi-head self-attention mechanism for sentiment analysis. LDA is employed to extract latent topics within comment texts, revealing text relevance and providing fine-grained user feedback. Shuffle enhancement is applied to RealNVP to model the distribution of text topic features effectively, improving performance without excessive model complexity or computational overhead. The double-layer bidirectional improved LSTM couples its forget and input gates, capturing the dynamic temporal changes in sentiment with greater flexibility. The multi-head self-attention mechanism enhances the model's ability to select and focus on key information, thereby more accurately reflecting user experiences. Experimental results on both Chinese and English online comment datasets demonstrate that the proposed integrated model achieves improved topic coherence compared to traditional LDA models, effectively mitigating overfitting. Furthermore, the model outperforms single models and other baselines in sentiment classification tasks, as evidenced by superior accuracy and F1 scores. These results underscore the model's effectiveness for both Chinese and English sentiment analysis in the context of online comments.
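The gate coupling mentioned above can be sketched as a single step of a coupled input-forget gate (CIFG) LSTM cell, where the input gate is tied to the forget gate as i = 1 - f, removing one gate's parameters. This is a minimal NumPy illustration under assumed layer sizes and parameterization; it is not the paper's actual implementation.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def cifg_lstm_step(x, h_prev, c_prev, W, U, b):
    """One step of an LSTM with coupled forget/input gates (CIFG).

    Because the input gate is tied to the forget gate (i = 1 - f), only
    three parameter blocks are needed instead of the standard four:
    forget gate, cell candidate, and output gate.
    W: (3*H, D) input weights, U: (3*H, H) recurrent weights, b: (3*H,) bias.
    """
    H = h_prev.shape[0]
    z = W @ x + U @ h_prev + b
    f = sigmoid(z[:H])            # forget gate
    g = np.tanh(z[H:2 * H])       # candidate cell state
    o = sigmoid(z[2 * H:])        # output gate
    c = f * c_prev + (1.0 - f) * g  # coupled gates: input gate is 1 - f
    h = o * np.tanh(c)
    return h, c

# Example usage with assumed dimensions (input D=4, hidden H=3)
rng = np.random.default_rng(0)
D, H = 4, 3
W = rng.standard_normal((3 * H, D))
U = rng.standard_normal((3 * H, H))
b = np.zeros(3 * H)
h, c = cifg_lstm_step(rng.standard_normal(D), np.zeros(H), np.zeros(H), W, U, b)
```

A bidirectional double-layer variant would run this step forward and backward over the sequence in each of two stacked layers; the coupling keeps the parameter count roughly three-quarters that of a standard LSTM while preserving the new-information/retention trade-off in a single gate.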