Real-Time Twitter Spam Detection and Sentiment Analysis using Machine Learning and Deep Learning Techniques.

Authors

Rodrigues Anisha P, Fernandes Roshan, A Aakash, B Abhishek, Shetty Adarsh, K Atul, Lakshmanna Kuruva, Shafi R Mahammad

Affiliations

Department of Computer Science and Engineering, NMAM Institute of Technology, Nitte, Karkala, India.

SITE, Vellore Institute of Technology, Vellore, Tamilnadu, India.

Publication information

Comput Intell Neurosci. 2022 Apr 15;2022:5211949. doi: 10.1155/2022/5211949. eCollection 2022.

Abstract

We are accustomed to a constant stream of data. Major social media sites such as Twitter, Facebook, and Quora face a serious problem: many of them fall victim to spam accounts. These accounts are created to lure unsuspecting genuine users into clicking malicious links, or they use bots to post redundant content repeatedly. This can greatly degrade the experience users have on these sites. Considerable time and research have gone into finding effective ways to detect these forms of spam, and performing sentiment analysis on the posts can help address the problem. The main purpose of the proposed work is to develop a system that determines whether a tweet is "spam" or "ham" and evaluates the emotion of the tweet. For spam detection, the features extracted after preprocessing the tweets are classified using several classifiers: decision tree, logistic regression, multinomial naïve Bayes, support vector machine, random forest, and Bernoulli naïve Bayes. For sentiment analysis, the work uses stochastic gradient descent, support vector machine, logistic regression, random forest, and naïve Bayes, together with deep learning methods: a simple recurrent neural network (RNN) model, a long short-term memory (LSTM) model, a bidirectional long short-term memory (BiLSTM) model, and a 1D convolutional neural network (CNN) model. The performance of each classifier is analyzed. The classification results showed that the features extracted from the tweets can be used satisfactorily to identify whether a given tweet is spam and to build a learning model that associates tweets with a particular sentiment.
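
To make the spam/ham classification step concrete, the following is a minimal sketch of one of the classifiers named in the abstract, multinomial naïve Bayes, applied to bag-of-words counts. It is not the authors' implementation: the abstract does not describe their preprocessing, features, or data, so the tokenizer, the `MultinomialNB` class, the smoothing value `alpha`, and the six toy tweets are all assumptions made for illustration.

```python
import math
import re
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase a tweet and split it into word tokens (assumed, very simple preprocessing)."""
    return re.findall(r"[a-z0-9']+", text.lower())


class MultinomialNB:
    """Hypothetical from-scratch multinomial naive Bayes with Laplace smoothing."""

    def __init__(self, alpha=1.0):
        self.alpha = alpha  # additive smoothing constant (assumed value)

    def fit(self, texts, labels):
        # Count documents per class and word occurrences per class.
        self.class_counts = Counter(labels)
        self.word_counts = defaultdict(Counter)
        for text, label in zip(texts, labels):
            self.word_counts[label].update(tokenize(text))
        # The vocabulary is every word seen in any class during training.
        self.vocab = {w for counts in self.word_counts.values() for w in counts}
        return self

    def predict(self, text):
        total_docs = sum(self.class_counts.values())
        scores = {}
        for label, n_docs in self.class_counts.items():
            counts = self.word_counts[label]
            denom = sum(counts.values()) + self.alpha * len(self.vocab)
            # Start from the log prior, then add one log likelihood per token.
            score = math.log(n_docs / total_docs)
            for tok in tokenize(text):
                if tok in self.vocab:  # words never seen in training are skipped
                    score += math.log((counts[tok] + self.alpha) / denom)
            scores[label] = score
        # Return the class with the highest log posterior score.
        return max(scores, key=scores.get)


# Toy training set, invented for this example only.
TRAIN = [
    ("win a free iphone now click this link", "spam"),
    ("free followers click here to claim your prize", "spam"),
    ("claim your free gift card now", "spam"),
    ("having coffee with friends this morning", "ham"),
    ("great talk at the conference today", "ham"),
    ("watching the game with my family tonight", "ham"),
]

model = MultinomialNB().fit([t for t, _ in TRAIN], [y for _, y in TRAIN])
print(model.predict("click now to win a free prize"))
print(model.predict("coffee with my family this morning"))
```

On this toy data the first tweet is labeled spam and the second ham. A comparison like the one the abstract describes would train each listed classifier on the same extracted features and evaluate them on held-out tweets.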

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/c06a/9033328/1dd3db4f6136/CIN2022-5211949.001.jpg
