Aspect based sentiment analysis datasets for Bangla text.

Author information

Hasan Mahmudul, Ghani Md Rashedul, Hasan K M Azharul

Affiliations

Department of Computer Science and Engineering, Khulna University of Engineering & Technology, Khulna 9203, Bangladesh.

Department of Computer Science and Engineering, IUBAT- International University of Business Agriculture and Technology, 4 Embankment Drive Road, Sector-10, Uttara Model Town, Dhaka 1230, Bangladesh.

Publication information

Data Brief. 2024 Nov 2;57:111107. doi: 10.1016/j.dib.2024.111107. eCollection 2024 Dec.

Abstract

Sentiment analysis is rapidly becoming important for exploring Bangla text on social media. A lack of sufficient resources is considered a major challenge for Aspect Based Sentiment Analysis (ABSA) of the Bangla language. ABSA is a technique that divides a text by its aspects and determines the sentiment expressed toward each aspect. In this paper, we developed a high-quality annotated Bangla ABSA dataset named BANGLA_ABSA. The datasets are labelled with aspect categories and their respective sentiment polarities to support the ABSA task in Bangla. Four open domains, namely Restaurant, Movie, Mobile phone, and Car, are covered. The datasets, including Mobile_phone_ABSA, contain 801, 800, 975, and 1149 comments, respectively. All the comments are either complex or compound sentences. We created the dataset manually and annotated it based on human opinions. We organized the dataset in Excel format as 3-tuples, namely 〈. These data facilitate efficient handling of sentiment in machine learning and deep learning research, especially for Bangla text.
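To make the labelling scheme above concrete, the following is a minimal Python sketch of how aspect-level annotations might be represented and summarized. The field names (`comment`, `aspect`, `polarity`), the aspect labels, and the rows are all assumptions made for illustration: the tuple definition is truncated in the abstract, and the placeholder rows are invented rather than taken from BANGLA_ABSA.

```python
from collections import Counter
from typing import NamedTuple


class AbsaRecord(NamedTuple):
    # Hypothetical field names; the abstract's tuple definition is truncated.
    comment: str
    aspect: str
    polarity: str


def polarity_by_aspect(records):
    """Count how often each (aspect, polarity) pair occurs."""
    return Counter((r.aspect, r.polarity) for r in records)


# Invented placeholder rows, NOT real BANGLA_ABSA data.
# One comment can carry several aspects, each with its own polarity.
records = [
    AbsaRecord("placeholder comment 1", "food", "positive"),
    AbsaRecord("placeholder comment 1", "service", "negative"),
    AbsaRecord("placeholder comment 2", "food", "positive"),
]

counts = polarity_by_aspect(records)
for (aspect, polarity), n in sorted(counts.items()):
    print(aspect, polarity, n)
```

The point of the sketch is only that sentiment is attached to an aspect rather than to the whole comment, which is why the same comment can appear in more than one row.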

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/2d98/11617299/f9e7442b7dbc/gr1.jpg
