BanglaLekha-Isolated: A multi-purpose comprehensive dataset of Handwritten Bangla Isolated characters.

Authors

Biswas Mithun, Islam Rafiqul, Shom Gautam Kumar, Shopon Md, Mohammed Nabeel, Momen Sifat, Abedin Anowarul

Affiliations

Department of Computer Science and Engineering, University of Liberal Arts Bangladesh, Bangladesh.

Department of Computer Science and Engineering, University of Asia Pacific, Bangladesh.

Publication information

Data Brief. 2017 Mar 29;12:103-107. doi: 10.1016/j.dib.2017.03.035. eCollection 2017 Jun.

Abstract

BanglaLekha-Isolated, a dataset of handwritten isolated Bangla characters, is presented in this article. The dataset contains 84 different characters: 50 Bangla basic characters, 10 Bangla numerals, and 24 selected compound characters. For each of the 84 characters, 2,000 handwriting samples were collected, digitized, and pre-processed. After discarding mistakes and scribbles, 166,105 handwritten character images were included in the final dataset. The dataset also includes labels indicating the age and gender of the subjects from whom the samples were collected. It can be used not only for optical handwriting recognition research but also to explore the influence of gender and age on handwriting. The dataset is publicly available at https://data.mendeley.com/datasets/hf6sf8zrkc/2.
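The counts in the abstract can be cross-checked with a little arithmetic. The sketch below is illustrative only: the constant and function names are hypothetical, and the number of discarded samples is not stated in the abstract. It is derived here on the assumption that exactly 2000 samples were collected for every one of the 84 characters.

```python
# Counts stated in the abstract.
BASIC_CHARACTERS = 50
NUMERALS = 10
COMPOUND_CHARACTERS = 24
SAMPLES_PER_CHARACTER = 2000
FINAL_IMAGE_COUNT = 166_105


def dataset_summary():
    """Derive class and sample totals from the abstract's figures.

    The discarded count is an inference (collected minus retained),
    assuming exactly 2000 samples were collected per character.
    """
    num_classes = BASIC_CHARACTERS + NUMERALS + COMPOUND_CHARACTERS
    collected = num_classes * SAMPLES_PER_CHARACTER
    discarded = collected - FINAL_IMAGE_COUNT
    return {
        "classes": num_classes,
        "collected": collected,
        "discarded": discarded,
        "discard_rate": discarded / collected,
    }


summary = dataset_summary()
print(summary)
```

Under that assumption, the three character groups add up to the 84 classes reported, and the gap between collected and retained images gives a rough sense of how many samples were rejected as mistakes or scribbles.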
