An efficient method for disaster tweets classification using gradient-based optimized convolutional neural networks with BERT embeddings.

Author information

Dharrao Deepak, Mr Aadithyanarayanan, Mital Rewaa, Vengali Abhinav, Pangavhane Madhuri, Rajput Satpalsing, Bongale Anupkumar M

Affiliations

Department of Computer Science and Engineering, Symbiosis Institute of Technology, Pune Campus, Symbiosis International (Deemed University), Pune, India.

Department of Computer Engineering, Vishwakarma Institute of Technology, Pune, India.

Publication information

MethodsX. 2024 Jul 3;13:102843. doi: 10.1016/j.mex.2024.102843. eCollection 2024 Dec.

Abstract

Disaster events are actively discussed on microblogging platforms such as Twitter, which can lead to chaotic situations. In the era of machine learning and deep learning, these situations can be controlled by developing efficient methods and models that help classify tweets as real or fake. This research article proposes an efficient method, the BERT Embedding based CNN model with RMSProp Optimizer, to classify tweets related to disaster scenarios. Tweet classification is first carried out with popular machine learning algorithms such as logistic regression and decision tree classifiers. Because these machine learning models show low accuracy, a Convolutional Neural Network (CNN) based deep learning model is selected as the primary classification method. The CNN's performance is improved by optimizing its parameters with gradient-based optimizers. To further raise accuracy and to capture contextual semantics from the text data, BERT embeddings are included in the proposed model. The proposed method, the BERT Embedding based CNN model with RMSProp Optimizer, achieved an F1 score of 0.80 and an accuracy of 0.83.
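The abstract reports an F1 score and an accuracy for a binary (real vs. fake) tweet classifier. As a minimal sketch of how these two metrics are conventionally computed from predictions, the snippet below uses plain Python. The function name `classification_metrics` and the label lists are hypothetical illustrations and are not taken from the article or its dataset.

```python
def classification_metrics(y_true, y_pred, positive=1):
    """Return accuracy, precision, recall and F1 for binary labels."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("y_true and y_pred must be non-empty and equal in length")

    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)

    accuracy = (tp + tn) / len(pairs)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    # F1 is the harmonic mean of precision and recall.
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


# Hypothetical labels (1 = real disaster tweet, 0 = not), for illustration only.
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
y_pred = [1, 1, 0, 0, 0, 1, 0, 1]
metrics = classification_metrics(y_true, y_pred)
print(metrics)
```

Because F1 ignores true negatives while accuracy counts them, the two values can differ on imbalanced data, which is one reason both are commonly reported together.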
The methodology presented in this research article makes the following key contributions:
• Identification of a suitable text classification model that can effectively capture complex patterns when dealing with large vocabularies or nuanced language structures in disaster management scenarios.
• Exploration of gradient-based optimization techniques such as the Adam optimizer, the Stochastic Gradient Descent (SGD) optimizer, AdaGrad, and the RMSprop optimizer, to identify the optimizer that best suits the characteristics of the dataset and the CNN model architecture.
• Presentation of the "BERT Embedding based CNN model with RMSProp Optimizer", a method that classifies disaster tweets and captures semantic representations by leveraging BERT embeddings with appropriate feature selection; the models are validated with appropriate comparative analysis.
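To make the optimizer comparison more concrete, the following is a minimal sketch of the standard RMSProp update rule next to plain SGD, written in pure Python. It is not the article's implementation: the toy quadratic loss, the hyperparameters (`lr`, `decay`, `eps`), and all function names are assumptions chosen for illustration, and the outcome says nothing about the article's experiments.

```python
import math


def rmsprop_step(params, grads, cache, lr=0.01, decay=0.9, eps=1e-8):
    """One RMSProp update: scale each gradient by a running RMS of its history."""
    new_params, new_cache = [], []
    for p, g, c in zip(params, grads, cache):
        c = decay * c + (1.0 - decay) * g * g      # running mean of squared gradients
        p = p - lr * g / (math.sqrt(c) + eps)      # per-parameter scaled step
        new_params.append(p)
        new_cache.append(c)
    return new_params, new_cache


def sgd_step(params, grads, lr=0.01):
    """One plain SGD update with a single global learning rate."""
    return [p - lr * g for p, g in zip(params, grads)]


def toy_loss(w):
    # Hypothetical badly scaled quadratic: f(w) = w0^2 + 100 * w1^2
    return w[0] ** 2 + 100.0 * w[1] ** 2


def toy_grad(w):
    return [2.0 * w[0], 200.0 * w[1]]


initial = [1.0, 1.0]
initial_loss = toy_loss(initial)

w_rms, cache = list(initial), [0.0, 0.0]
w_sgd = list(initial)
for _ in range(500):
    w_rms, cache = rmsprop_step(w_rms, toy_grad(w_rms), cache)
    w_sgd = sgd_step(w_sgd, toy_grad(w_sgd))

rms_loss = toy_loss(w_rms)
sgd_loss = toy_loss(w_sgd)
print(f"initial={initial_loss:.3f} rmsprop={rms_loss:.5f} sgd={sgd_loss:.3f}")
```

On this toy problem the steep coordinate makes SGD with the same learning rate bounce between opposite sides of the minimum, while RMSProp's per-parameter scaling keeps the step size bounded. Which optimizer works best on a real dataset and architecture remains an empirical question, as the contribution above notes.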

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/9bbb/11296064/05259cc79011/ga1.jpg
