Aurpa Tanjim Taharat, Fariha Kazi Noshin, Hossain Kawser
Department of Data Science, Bangabandhu Sheikh Mujibur Rahman Digital University, Bangladesh.
International University of Business Agriculture and Technology, Bangladesh.
Data Brief. 2024 Jul 17;55:110742. doi: 10.1016/j.dib.2024.110742. eCollection 2024 Aug.
Equation recognition is the task of identifying the equation expressed in a mathematical statement, and it is significant for developing a range of mathematical systems. In this paper, we introduce a novel Bangla mathematical equation dataset comprising 3430 observations, aimed at advancing mathematical equation recognition in the Bangla language. To the best of our knowledge, no comparable dataset has been developed for recognizing equations from text. Each entry in the dataset consists of a mathematical statement and its corresponding equation. This resource can support research in mathematical equation recognition, including the identification of common mathematical operations (such as addition, subtraction, multiplication, division, and roots) and of numerical values. With minor adjustments, researchers can also explore combinations of these operations and values. The dataset is provided in raw form as a CSV file with two columns, "Text" and "Equation," which makes it easy to handle in a variety of deep learning and machine learning tasks.
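As a minimal sketch of how the two-column layout described above might be consumed, the following Python snippet reads "Text" and "Equation" pairs and tallies the arithmetic operators that appear in the equations. The sample rows, the operator symbols (`+`, `-`, `*`, `/`), and the helper names `load_pairs` and `count_operators` are assumptions made for illustration; the actual notation used in the dataset's "Equation" column may differ.

```python
import csv
import io
from collections import Counter

# Hypothetical rows in the described two-column layout. These are made-up
# illustrations, NOT entries copied from the published dataset.
SAMPLE_CSV = '''Text,Equation
"পাঁচ যোগ তিন",5+3
"দশ বিয়োগ চার",10-4
"ছয় গুণ দুই",6*2
'''


def load_pairs(handle):
    """Read (statement, equation) pairs from a CSV with Text/Equation columns."""
    reader = csv.DictReader(handle)
    return [(row["Text"], row["Equation"]) for row in reader]


def count_operators(pairs, operators="+-*/"):
    """Count how often each operator symbol occurs across all equations.

    The operator set is an assumption; adjust it to the dataset's notation.
    """
    counts = Counter()
    for _, equation in pairs:
        counts.update(ch for ch in equation if ch in operators)
    return counts


pairs = load_pairs(io.StringIO(SAMPLE_CSV))
operator_counts = count_operators(pairs)
print(len(pairs))
print(operator_counts)
```

To use the real file, the in-memory sample could be replaced with `open(path, encoding="utf-8", newline="")`, where `path` points to the downloaded CSV; the file name is not given in the abstract, so it is left unspecified here.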