BanglaLekha-Isolated: A multi-purpose comprehensive dataset of Handwritten Bangla Isolated characters.

Authors

Biswas Mithun, Islam Rafiqul, Shom Gautam Kumar, Shopon Md, Mohammed Nabeel, Momen Sifat, Abedin Anowarul

Affiliations

Department of Computer Science and Engineering, University of Liberal Arts Bangladesh, Bangladesh.

Department of Computer Science and Engineering, University of Asia Pacific, Bangladesh.

Publication information

Data Brief. 2017 Mar 29;12:103-107. doi: 10.1016/j.dib.2017.03.035. eCollection 2017 Jun.

DOI:10.1016/j.dib.2017.03.035
PMID:28409178
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC5382023/

Abstract

BanglaLekha-Isolated, a Bangla handwritten isolated character dataset, is presented in this article. This dataset contains 84 different characters comprising 50 Bangla basic characters, 10 Bangla numerals and 24 selected compound characters. 2000 handwriting samples for each of the 84 characters were collected, digitized and pre-processed. After discarding mistakes and scribbles, 166,105 handwritten character images were included in the final dataset. The dataset also includes labels indicating the age and the gender of the subjects from whom the samples were collected. This dataset could be used not only for optical handwriting recognition research but also to explore the influence of gender and age on handwriting. The dataset is publicly available at https://data.mendeley.com/datasets/hf6sf8zrkc/2.
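
The counts quoted in the abstract can be cross-checked with a little arithmetic. The sketch below is only an illustration built from the abstract's figures: it assumes exactly 2000 samples were collected for every one of the 84 characters and that every sample missing from the final set of 166,105 images was discarded as a mistake or scribble. The function and constant names are hypothetical and are not part of the dataset or any published tooling.

```python
# Figures quoted in the abstract.
BASIC_CHARACTERS = 50
NUMERALS = 10
COMPOUND_CHARACTERS = 24
SAMPLES_PER_CHARACTER = 2000   # assumed to hold exactly for every character
FINAL_IMAGE_COUNT = 166_105    # images kept after discarding mistakes and scribbles


def dataset_summary():
    """Derive class and sample totals from the abstract's figures."""
    classes = BASIC_CHARACTERS + NUMERALS + COMPOUND_CHARACTERS
    collected = classes * SAMPLES_PER_CHARACTER
    # Assumption: the gap between collected and final counts is the discarded set.
    discarded = collected - FINAL_IMAGE_COUNT
    return {
        "classes": classes,
        "collected": collected,
        "discarded": discarded,
        "discard_rate": discarded / collected,
        "mean_images_per_class": FINAL_IMAGE_COUNT / classes,
    }


if __name__ == "__main__":
    for name, value in dataset_summary().items():
        print(f"{name}: {value}")
```

Under these assumptions the three character groups add up to the stated 84 classes, 168,000 samples were collected, and 1,895 of them (a little over 1%) were discarded. The abstract does not report per-class counts, so the mean images per class is only an average and says nothing about how evenly the discards were spread.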

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/838e/5382023/c1c8340e1ee5/gr3.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/838e/5382023/6d6cb628c5ed/gr1.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/838e/5382023/732689231aa9/gr2.jpg

Similar articles

1
BanglaLekha-Isolated: A multi-purpose comprehensive dataset of Handwritten Bangla Isolated characters.
Data Brief. 2017 Mar 29;12:103-107. doi: 10.1016/j.dib.2017.03.035. eCollection 2017 Jun.
2
A multi-purpose dataset of Devanagari script comprising of isolated numerals and vowels.
Data Brief. 2021 Dec 16;40:107723. doi: 10.1016/j.dib.2021.107723. eCollection 2022 Feb.
3
BanglaWriting: A multi-purpose offline Bangla handwriting dataset.
Data Brief. 2020 Dec 9;34:106633. doi: 10.1016/j.dib.2020.106633. eCollection 2021 Feb.
4
Convolutional neural network-based ensemble methods to recognize Bangla handwritten character.
PeerJ Comput Sci. 2021 Jun 28;7:e565. doi: 10.7717/peerj-cs.565. eCollection 2021.
5
CBD2023: A Hypercomplex Bangla Handwriting Character Recognition Data for Hierarchical Class Expansion.
Data Brief. 2023 Dec 8;52:109909. doi: 10.1016/j.dib.2023.109909. eCollection 2024 Feb.
6
Arabic handwritten alphabets, words and paragraphs per user (AHAWP) dataset.
Data Brief. 2022 Feb 13;41:107947. doi: 10.1016/j.dib.2022.107947. eCollection 2022 Apr.
7
PHND: Pashtu Handwritten Numerals Database and deep learning benchmark.
PLoS One. 2020 Sep 2;15(9):e0238423. doi: 10.1371/journal.pone.0238423. eCollection 2020.
8
A vast dataset for Kurdish handwritten digits and isolated characters recognition.
Data Brief. 2023 Mar 2;47:109014. doi: 10.1016/j.dib.2023.109014. eCollection 2023 Apr.
9
GHCR-A dataset for Grantha handwritten character recognition.
Data Brief. 2024 Aug 6;56:110783. doi: 10.1016/j.dib.2024.110783. eCollection 2024 Oct.
10
iVision HHID: Handwritten hyperspectral images dataset for benchmarking hyperspectral imaging-based document forensic analysis.
Data Brief. 2022 Feb 16;41:107964. doi: 10.1016/j.dib.2022.107964. eCollection 2022 Apr.

Cited by

1
An online cursive handwritten medical words recognition system for busy doctors in developing countries for ensuring efficient healthcare service delivery.
Sci Rep. 2022 Mar 4;12(1):3601. doi: 10.1038/s41598-022-07571-z.
2
Convolutional neural network-based ensemble methods to recognize Bangla handwritten character.
PeerJ Comput Sci. 2021 Jun 28;7:e565. doi: 10.7717/peerj-cs.565. eCollection 2021.
3
BanglaWriting: A multi-purpose offline Bangla handwriting dataset.
Data Brief. 2020 Dec 9;34:106633. doi: 10.1016/j.dib.2020.106633. eCollection 2021 Feb.
