Public opinion mining in social media about Ethiopian broadcasts using deep learning.

Affiliations

Department of Information Technology, Debre Markos University, Debre Markos, Ethiopia.

Department of Software Engineering, Debre Markos University, Debre Markos, Ethiopia.

Publication information

Sci Rep. 2024 Nov 12;14(1):27676. doi: 10.1038/s41598-024-76542-3.

Abstract

Nowadays, social media lets people express and share their opinions on a wide range of events online. Opinion mining is the process of interpreting user-generated opinion data on social media. Besides having few resources for opinion-mining tasks, Amharic presents numerous difficulties because of its complex structure and variety of dialects, so analyzing every comment written in Amharic is a challenging task. Deep learning has brought significant advances in opinion mining. In this study, an opinion-mining model was used to classify user comments written in Amharic as positive or negative. The study focuses on two platforms, YouTube and Facebook. From the official YouTube and Facebook pages of Ethiopian broadcasters, we gathered 11,872 unstructured comments using www.exportcomment.com and Facebook page tools. The comments were manually annotated by linguistic specialists, and text preprocessing and feature extraction techniques were then applied. After annotation, preprocessing, and representation, the dataset was ready for the experiment. LSTM, GRU, BiGRU, BiLSTM, and hybrid CNN-BiLSTM classifiers from the TensorFlow Keras deep learning library were trained on the dataset, which was divided using an 80/20 train-test split, an approach that proved effective for classification problems. Finally, with the word2vec embedding model, we achieved accuracies of 94.27%, 95.20%, 95.49%, 95.62%, and 96.08% using GRU, BiGRU, LSTM, BiLSTM, and CNN with BiLSTM, respectively.
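The 80/20 train-test split mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not say how the split was performed (random seed, shuffling, or stratification by label), so the function name `split_80_20`, the seed, and the placeholder comments and labels below are all assumptions made for illustration, not the authors' procedure.

```python
import random


def split_80_20(samples, test_fraction=0.2, seed=42):
    """Shuffle samples and split them into (train, test) lists.

    Hypothetical helper: the seed and the plain random shuffle are
    assumptions, not details taken from the paper.
    """
    if not 0.0 < test_fraction < 1.0:
        raise ValueError("test_fraction must be strictly between 0 and 1")
    shuffled = list(samples)
    # A dedicated Random instance keeps the split reproducible.
    random.Random(seed).shuffle(shuffled)
    n_test = round(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]


# Placeholder data standing in for the 11,872 annotated comments:
# (text, label) pairs with label 1 = positive, 0 = negative (assumed encoding).
comments = [(f"comment_{i}", i % 2) for i in range(11872)]

train, test = split_80_20(comments)
print(len(train), len(test))  # 9498 2374
```

With 11,872 items, a 20% test share rounds to 2,374 comments, leaving 9,498 for training. A stratified split, which preserves the positive/negative ratio in both parts, would be a reasonable alternative if the classes are imbalanced.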


https://cdn.ncbi.nlm.nih.gov/pmc/blobs/0d66/11557882/5e989b24b885/41598_2024_76542_Fig1_HTML.jpg
