Deep learning based sentiment analysis of public perception of working from home through tweets.

Author information

Vohra Aarushi, Garg Ritu

Affiliation

Department of Computer Engineering, National Institute of Technology Kurukshetra, Kurukshetra 136119, Haryana, India.

Publication information

J Intell Inf Syst. 2023;60(1):255-274. doi: 10.1007/s10844-022-00736-2. Epub 2022 Aug 24.

Abstract

We are witnessing a paradigm shift from the conventional practice of working in office spaces to an emerging culture of working virtually from home. During the COVID-19 pandemic, many organisations were forced to allow employees to work from home, which led to worldwide discussion of this trend on Twitter. Analysing this data has considerable potential to change the way we work, but extracting useful information from it is a challenge. In this study, we gather more than 450,000 English-language tweets posted on the microblogging website Twitter between 22 January 2022 and 12 March 2022 that contain keywords related to working from home. A pre-processing step converts all emojis into text; removes duplicate tweets, retweets, username tags, URLs, hashtags, etc.; and converts the text to lowercase, which reduces the number of tweets to 358,823. We propose a fine-tuned Convolutional Neural Network (CNN) model to analyse the Twitter data. The input to our deep learning model is a set of tweets labelled with one of three sentiment classes, viz. positive, negative, and neutral, using VADER (Valence Aware Dictionary and sEntiment Reasoner). We also vary the input vector to the embedding layer by using FastText embeddings with our model to train supervised word representations for our text corpus of more than 450,000 tweets. The proposed model uses multiple convolution and max pooling layers, a dropout operation, and dense layers with ReLU and sigmoid activations. We compare its performance with standard classifiers such as Support Vector Machine (SVM), Naive Bayes, Decision Tree, and Random Forest. On the given dataset, the proposed CNN with FastText word embeddings outperforms the other classifiers, with an accuracy of 0.925969. According to this classification, 54.41% of the tweets show affirmation, 24.50% show a negative disposition, and 21.09% are neutral towards working from home.
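
The cleaning and labelling steps described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' code. The regular expressions, the two-entry `EMOJI_TO_TEXT` table, and all function names are hypothetical, and the sketch assumes that "removing hashtags" means dropping the whole hashtag token. The `label_from_compound` helper maps a VADER compound score to a class using the ±0.05 cut-offs conventionally recommended in VADER's documentation; the abstract does not state which thresholds were used.

```python
import re

# Hypothetical, tiny emoji-to-text table; a real pipeline needs a full mapping.
EMOJI_TO_TEXT = {"\U0001F600": " grinning face ", "\U0001F3E0": " house "}

# Illustrative patterns (assumptions, not taken from the paper).
URL_RE = re.compile(r"https?://\S+|www\.\S+")
MENTION_RE = re.compile(r"@\w+")
HASHTAG_RE = re.compile(r"#\w+")
RETWEET_RE = re.compile(r"^RT\s+@\w+:?", re.IGNORECASE)
WHITESPACE_RE = re.compile(r"\s+")


def clean_tweet(text):
    """Return the cleaned tweet text, or None if the tweet is a retweet."""
    if RETWEET_RE.match(text):
        return None
    # Convert emojis to text before stripping other tokens.
    for symbol, name in EMOJI_TO_TEXT.items():
        text = text.replace(symbol, name)
    text = URL_RE.sub(" ", text)
    text = MENTION_RE.sub(" ", text)
    text = HASHTAG_RE.sub(" ", text)
    # Collapse whitespace left by the removals, then lowercase.
    return WHITESPACE_RE.sub(" ", text).strip().lower()


def preprocess(tweets):
    """Clean tweets, dropping retweets, empty results, and duplicates."""
    seen = set()
    cleaned = []
    for raw in tweets:
        text = clean_tweet(raw)
        if not text or text in seen:
            continue
        seen.add(text)
        cleaned.append(text)
    return cleaned


def label_from_compound(compound, threshold=0.05):
    """Map a VADER compound score in [-1, 1] to one of three classes.

    The 0.05 threshold is the conventional VADER cut-off and is an
    assumption here; the abstract does not give the value used.
    """
    if compound >= threshold:
        return "positive"
    if compound <= -threshold:
        return "negative"
    return "neutral"
```

In this sketch duplicates are detected after cleaning, so two tweets that differ only in URLs, mentions, or letter case count as the same tweet. Whether the study deduplicated before or after cleaning is not stated in the abstract.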
