Emotion Analysis Model of Microblog Comment Text Based on CNN-BiLSTM.

Affiliations

College of Information and Electrical Engineering, Heilongjiang Bayi Agricultural University, Daqing, Heilongjiang 163319, China.

Engineering Research Center of Processing and Utilization of Grain By-Products, Ministry of Education, Heilongjiang Engineering Technology Research Center for Rice Ecological Seedlings Device and Whole Process Mechanization, Daqing, Heilongjiang 163319, China.

Publication information

Comput Intell Neurosci. 2022 Apr 30;2022:1669569. doi: 10.1155/2022/1669569. eCollection 2022.

Abstract

To address the heavy reliance on manual labor and the poor generalization of traditional dictionary-based and machine-learning-based emotion analysis methods, an emotion analysis model for microblog comment text based on deep learning is proposed. First, text is collected with a microblog crawler. After preprocessing (data cleaning, Chinese word segmentation, stop-word removal, and so on), a Skip-gram model is trained on a large-scale unlabeled corpus to produce word vectors, which then serve as the text input to the CNN-BiLSTM model, a combination of a Bidirectional Long Short-Term Memory (BiLSTM) network and a Convolutional Neural Network (CNN). Because it considers both the preceding and the following context, BiLSTM can better use the sequential relationships in text to learn sentence semantics. CNN can extract hidden features from the text and combine them. Finally, after training with the Adamax optimizer, the model outputs the emotion type of the microblog comment text. The proposed model combines the learning advantages of BiLSTM and CNN and reaches an accuracy of 0.94, an improvement of 8.51% over the single CNN model.
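The preprocessing and Skip-gram steps described in the abstract can be illustrated with a minimal sketch. It is not the authors' code: the stop-word list, the sample sentence, the window size, and the function names `remove_stop_words` and `skipgram_pairs` are all assumptions made for illustration, and the input is assumed to be already segmented into words. The sketch only shows how stop words are filtered and how (center, context) training pairs for a Skip-gram model are formed. It does not train word vectors.

```python
from typing import List, Set, Tuple

# Hypothetical stop-word list; the list used in the paper is not given.
STOP_WORDS: Set[str] = {"的", "了", "是"}


def remove_stop_words(tokens: List[str], stop_words: Set[str] = STOP_WORDS) -> List[str]:
    """Drop empty tokens and tokens that appear in the stop-word list."""
    return [t for t in tokens if t.strip() and t not in stop_words]


def skipgram_pairs(tokens: List[str], window: int = 2) -> List[Tuple[str, str]]:
    """Build (center, context) pairs: each word is paired with every
    other word at most `window` positions to its left or right."""
    pairs: List[Tuple[str, str]] = []
    for i, center in enumerate(tokens):
        start = max(0, i - window)
        stop = min(len(tokens), i + window + 1)
        for j in range(start, stop):
            if j != i:
                pairs.append((center, tokens[j]))
    return pairs


# Assumed example: a comment that has already been segmented into words.
segmented = ["这", "部", "电影", "的", "剧情", "是", "很", "精彩"]
tokens = remove_stop_words(segmented)
pairs = skipgram_pairs(tokens, window=1)
```

A Skip-gram model is trained to predict the context word from the center word in each pair. The resulting word vectors are what the abstract describes as the input to the CNN-BiLSTM model.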

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/f349/9078776/35ce0f1d2167/CIN2022-1669569.001.jpg
