
The evolution of the coronavirus disease (COVID-19) has taken a toll on the social, healthcare, economic, and psychological well-being of human beings. In the past couple of months, many organizations, individuals, and governments have used Twitter to convey their sentiments on COVID-19, the lockdown, the pandemic, and related hashtags. This paper aims to analyze the psychological reactions and discourse of Twitter users related to COVID-19. In this experiment, Latent Dirichlet Allocation (LDA) is used for topic modeling. In addition, a Bidirectional Long Short-Term Memory (BiLSTM) model and various classification techniques, such as random forest, support vector machine, logistic regression, naive Bayes, decision tree, logistic regression with a stochastic gradient descent optimizer, and a majority voting classifier, are applied to analyze sentiment polarity. The effectiveness of these approaches, along with LDA modeling, has been tested, validated, and compared on several benchmark datasets and on a newly generated dataset. To achieve better results, a dual dataset approach has been incorporated to determine the frequency of positive and negative tweets and to build word clouds, which helps to identify the most effective model for analyzing the corpora. The experimental results show that the BiLSTM approach outperforms the other approaches with an accuracy of 96.7%.

Semantic Analysis and Topic Modelling of Web-Scrapped COVID-19 Tweet Corpora through Data Mining Methodologies.

Author information

Gourisaria Mahendra Kumar, Chandra Satish, Das Himansu, Patra Sudhansu Shekhar, Sahni Manoj, Leon-Castro Ernesto, Singh Vijander, Kumar Sandeep

Affiliations

School of Computer Engineering, KIIT Deemed to be University, Bhubaneswar 751024, Odisha, India.

School of Computer Applications, KIIT Deemed to be University, Bhubaneswar 751024, Odisha, India.

Publication information

Healthcare (Basel). 2022 May 10;10(5):881. doi: 10.3390/healthcare10050881.

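The abstract lists a majority voting classifier among the sentiment-polarity models. As a rough illustration of hard majority voting only, and not the authors' implementation, the sketch below combines three toy classifiers. The keyword lists and the names `keyword_classifier`, `majority_vote`, and `ensemble` are hypothetical stand-ins for the trained models (random forest, support vector machine, and so on) that the paper describes.

```python
from collections import Counter
from typing import Callable, Iterable, Sequence

# Assumption: each base model is a function from tweet text to a polarity label.
Classifier = Callable[[str], str]


def keyword_classifier(positive: Iterable[str], negative: Iterable[str]) -> Classifier:
    """Build a toy classifier that compares positive and negative keyword counts."""
    pos_words = frozenset(positive)
    neg_words = frozenset(negative)

    def predict(text: str) -> str:
        tokens = text.lower().split()
        pos = sum(token in pos_words for token in tokens)
        neg = sum(token in neg_words for token in tokens)
        # Ties default to "positive" in this toy model.
        return "positive" if pos >= neg else "negative"

    return predict


def majority_vote(classifiers: Sequence[Classifier], text: str) -> str:
    """Return the label chosen by the largest number of classifiers (hard voting)."""
    votes = Counter(clf(text) for clf in classifiers)
    # With an odd number of binary classifiers there is always a strict majority.
    return votes.most_common(1)[0][0]


# Hypothetical ensemble of three keyword-based models.
ensemble = [
    keyword_classifier({"safe", "hope", "recovered"}, {"fear", "death", "lockdown"}),
    keyword_classifier({"hope", "vaccine"}, {"fear", "crisis"}),
    keyword_classifier({"thanks", "recovered"}, {"death", "lockdown", "crisis"}),
]

if __name__ == "__main__":
    for tweet in ("fear and death during lockdown",
                  "hope the vaccine ends this crisis"):
        print(tweet, "->", majority_vote(ensemble, tweet))
```

In the second example tweet the third toy model disagrees with the other two, and the vote still resolves to the label held by the majority. Using an odd number of base models avoids ties for a two-class problem.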