Department of Information and Communication Technology, Comilla University, Cumilla, Bangladesh.
Department of Computer Science and Engineering, CCN University of Science & Technology, Cumilla, Bangladesh.
PLoS One. 2024 Sep 20;19(9):e0308050. doi: 10.1371/journal.pone.0308050. eCollection 2024.
In recent years, the surge in reviews and comments on newspapers and social media has made sentiment analysis a focal point for researchers, and interest in sentiment analysis for the Bengali language is growing as well. However, Aspect-Based Sentiment Analysis (ABSA) remains a difficult task in Bengali because of the shortage of well-labeled datasets and the complex linguistic variations of the language. In this study, we used two open-source Bengali benchmark datasets, Cricket and Restaurant, for the ABSA task. The original work on these datasets was based on Random Forest, Support Vector Machine, K-Nearest Neighbors, and Convolutional Neural Network models. In this work, we used Bidirectional Encoder Representations from Transformers (BERT), the Robustly Optimized BERT Approach (RoBERTa), and our proposed hybrid transformative Random Forest and BERT (tRF-BERT) model, and compared the results with the existing work. All the models used in our work achieved better results than any of the previous works on the same datasets. Among them, the proposed tRF-BERT model achieved the highest F1 score and accuracy. For aspect detection, the accuracy and F1 score were 0.89 and 0.85, respectively, on the Cricket dataset, and 0.92 and 0.89, respectively, on the Restaurant dataset.
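To make the two reported metrics concrete, the following is a minimal sketch of how accuracy and F1 score can be computed for a multi-class aspect detection task. It is illustrative only: the abstract does not state which F1 averaging scheme was used, so macro averaging is an assumption here, and the aspect labels and predictions are hypothetical placeholders rather than data from the Cricket or Restaurant datasets.

```python
from collections import Counter


def accuracy(y_true, y_pred):
    """Fraction of samples whose predicted aspect equals the true aspect."""
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 scores (macro averaging is an assumption)."""
    labels = sorted(set(y_true) | set(y_pred))
    tp, fp, fn = Counter(), Counter(), Counter()
    for t, p in zip(y_true, y_pred):
        if t == p:
            tp[t] += 1          # correct prediction for class t
        else:
            fp[p] += 1          # class p was predicted but is wrong
            fn[t] += 1          # class t was missed
    scores = []
    for label in labels:
        denom = 2 * tp[label] + fp[label] + fn[label]
        # F1 = 2TP / (2TP + FP + FN); defined as 0 when the class never occurs
        scores.append(2 * tp[label] / denom if denom else 0.0)
    return sum(scores) / len(scores)


# Hypothetical aspect labels, for illustration only
y_true = ["batting", "bowling", "team", "batting", "team", "bowling"]
y_pred = ["batting", "bowling", "team", "bowling", "team", "bowling"]

acc = accuracy(y_true, y_pred)
f1 = macro_f1(y_true, y_pred)
print(f"accuracy={acc:.3f}, macro F1={f1:.3f}")
```

Because F1 is computed per class before averaging, it can fall below accuracy when some aspect classes are predicted less reliably than others, a pattern consistent with the reported figures (for example, 0.89 accuracy versus 0.85 F1 on Cricket).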