tRF-BERT: A transformative approach to aspect-based sentiment analysis in the Bengali language.

Affiliations

Department of Information and Communication Technology, Comilla University, Cumilla, Bangladesh.

Department of Computer Science and Engineering, CCN University of Science & Technology, Cumilla, Bangladesh.

Publication information

PLoS One. 2024 Sep 20;19(9):e0308050. doi: 10.1371/journal.pone.0308050. eCollection 2024.

Abstract

In recent years, the surge in reviews and comments in newspapers and on social media has made sentiment analysis a focal point of research, and interest in sentiment analysis for the Bengali language is growing as well. Aspect-Based Sentiment Analysis (ABSA), however, remains difficult in Bengali because of the shortage of accurately labeled datasets and the complex linguistic variation of the language. This study used two open-source Bengali benchmark datasets, Cricket and Restaurant, for the ABSA task. The original work on these datasets was based on Random Forest, Support Vector Machine, K-Nearest Neighbors, and Convolutional Neural Network models. In this work, we used Bidirectional Encoder Representations from Transformers (BERT), the Robustly Optimized BERT Pretraining Approach (RoBERTa), and our proposed hybrid transformative Random Forest and BERT (tRF-BERT) model, and compared the results with the existing work. All the models used in our work achieved better results than any of the previous works on the same datasets. Among them, the proposed tRF-BERT model achieved the highest F1 score and accuracy. For aspect detection, the accuracy and F1 score were 0.89 and 0.85, respectively, on the Cricket dataset and 0.92 and 0.89, respectively, on the Restaurant dataset.
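The two metrics reported above, accuracy and F1 score for aspect detection, can be illustrated with a minimal sketch. The abstract does not state which F1 averaging scheme was used, so the macro average below is an assumption, and the aspect labels and predictions are hypothetical illustrations rather than data from the Cricket or Restaurant datasets.

```python
from collections import Counter


def accuracy(y_true, y_pred):
    """Fraction of samples whose predicted aspect equals the gold aspect."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("y_true and y_pred must be non-empty and equal in length")
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 scores (macro averaging is an assumption)."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("y_true and y_pred must be non-empty and equal in length")
    tp, fp, fn = Counter(), Counter(), Counter()
    for t, p in zip(y_true, y_pred):
        if t == p:
            tp[t] += 1   # correct prediction for class t
        else:
            fp[p] += 1   # class p was predicted but is wrong
            fn[t] += 1   # class t was missed
    labels = sorted(set(y_true) | set(y_pred))
    scores = []
    for label in labels:
        # F1 = 2*TP / (2*TP + FP + FN); defined as 0 when the denominator is 0
        denom = 2 * tp[label] + fp[label] + fn[label]
        scores.append(2 * tp[label] / denom if denom else 0.0)
    return sum(scores) / len(scores)


# Hypothetical gold and predicted aspect labels, for illustration only.
y_true = ["batting", "bowling", "team", "batting", "team", "bowling"]
y_pred = ["batting", "bowling", "team", "team", "team", "batting"]

print(f"accuracy = {accuracy(y_true, y_pred):.2f}")
print(f"macro F1 = {macro_f1(y_true, y_pred):.2f}")
```

Accuracy counts every sample equally, whereas macro F1 weights every aspect class equally, so the two can diverge when the aspect classes are imbalanced; this is one reason both figures are typically reported together.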


https://cdn.ncbi.nlm.nih.gov/pmc/blobs/04e1/11414928/5726d6959d97/pone.0308050.g001.jpg
