Biswas Mithun, Islam Rafiqul, Shom Gautam Kumar, Shopon Md, Mohammed Nabeel, Momen Sifat, Abedin Anowarul
Department of Computer Science and Engineering, University of Liberal Arts Bangladesh, Bangladesh.
Department of Computer Science and Engineering, University of Asia Pacific, Bangladesh.
Data Brief. 2017 Mar 29;12:103-107. doi: 10.1016/j.dib.2017.03.035. eCollection 2017 Jun.
BanglaLekha-Isolated, a Bangla handwritten isolated character dataset, is presented in this article. The dataset contains 84 different characters: 50 Bangla basic characters, 10 Bangla numerals and 24 selected compound characters. For each of the 84 characters, 2000 handwriting samples were collected, digitized and pre-processed. After discarding mistakes and scribbles, 166,105 handwritten character images were included in the final dataset. The dataset also includes labels indicating the age and gender of the subjects from whom the samples were collected. It can therefore be used not only for optical handwriting recognition research but also to explore the influence of gender and age on handwriting. The dataset is publicly available at https://data.mendeley.com/datasets/hf6sf8zrkc/2.
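The counts quoted in the abstract can be cross-checked with a little arithmetic. The sketch below is only an illustration: it assumes exactly 2000 samples were collected for every one of the 84 characters, so the "collected" total is nominal, and the constant and function names are hypothetical rather than part of the dataset.

```python
# Figures quoted in the abstract; the names are illustrative only.
BASIC_CHARACTERS = 50
NUMERALS = 10
COMPOUND_CHARACTERS = 24
SAMPLES_PER_CHARACTER = 2000   # assumed to hold exactly for every character
RETAINED_IMAGES = 166_105      # images kept after discarding mistakes and scribbles


def summarize():
    """Derive class count, nominal sample total and discard rate."""
    classes = BASIC_CHARACTERS + NUMERALS + COMPOUND_CHARACTERS
    collected = classes * SAMPLES_PER_CHARACTER  # nominal total under the assumption above
    discarded = collected - RETAINED_IMAGES
    return {
        "classes": classes,
        "collected": collected,
        "discarded": discarded,
        "discard_rate": discarded / collected,
        "mean_per_class": RETAINED_IMAGES / classes,
    }


summary = summarize()
print(summary)
```

Under that assumption, 168,000 samples were collected and 1,895 (roughly 1.1%) were discarded, leaving an average of about 1,977 images per character. The abstract does not give actual per-class counts, so they may differ from this average.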