A vast dataset for Kurdish handwritten digits and isolated characters recognition.

Authors

Abdalla Peshraw Ahmed, Qadir Abdalbasit Mohammed, Shakor Mohammed Y, Saeed Ari M, Jabar Abdalla Taha, Salam Ali Abdalla, Amin Hedi Hamid Hama

Affiliations

Department of Computer Science, College of Science, University of Halabja, Halabja, Iraq.

Department of Computer Science, College of Science and Technology, University of Human Development, Sulaimaniyah, Iraq.

Publication information

Data Brief. 2023 Mar 2;47:109014. doi: 10.1016/j.dib.2023.109014. eCollection 2023 Apr.

DOI:10.1016/j.dib.2023.109014
PMID:36936638
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC10018436/
Abstract

This article presents two large datasets of Central Kurdish handwritten digits and isolated characters. The first dataset contains 70,000 images of Kurdish digits, 7,000 images per digit; the data were collected on printed A4 sheets with a 10 × 10 grid. The second dataset contains 245,000 images covering all Kurdish characters, 7,000 images per character; these data were collected on printed A4 sheets with a 12 × 10 grid. Together, the two datasets comprise 315,000 images. Python was used to scan each sheet and to segment, crop, resize, binarize, and invert the images using edge detection and image processing techniques.
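The counts and the preprocessing steps described in the abstract can be illustrated with a small sketch. This is not the authors' code: it assumes a perfectly aligned sheet that divides evenly into grid cells, uses a fixed threshold instead of edge detection, and works on plain nested lists so that it runs without third-party libraries. The function names, the threshold of 128, and the output size are all hypothetical choices made for illustration.

```python
# Illustrative sketch only; names, threshold, and sizes are assumptions.

def split_grid(sheet, rows, cols):
    """Cut a 2D grayscale sheet (list of pixel rows) into rows*cols equal cells,
    returned in row-major order."""
    cell_h = len(sheet) // rows
    cell_w = len(sheet[0]) // cols
    cells = []
    for r in range(rows):
        for c in range(cols):
            cell = [row[c * cell_w:(c + 1) * cell_w]
                    for row in sheet[r * cell_h:(r + 1) * cell_h]]
            cells.append(cell)
    return cells


def binarize_and_invert(cell, threshold=128):
    """Map dark ink (below threshold) to 255 and light paper to 0."""
    return [[255 if px < threshold else 0 for px in row] for row in cell]


def resize_nearest(cell, out_h, out_w):
    """Nearest-neighbour resize of a 2D list to out_h x out_w."""
    in_h, in_w = len(cell), len(cell[0])
    return [[cell[y * in_h // out_h][x * in_w // out_w] for x in range(out_w)]
            for y in range(out_h)]


def process_sheet(sheet, rows, cols, out_size=10, threshold=128):
    """Segment a sheet into cells, then binarize, invert, and resize each cell."""
    return [resize_nearest(binarize_and_invert(cell, threshold), out_size, out_size)
            for cell in split_grid(sheet, rows, cols)]


# Synthetic 20x20 white "sheet" with a 2x2 grid and one dark vertical stroke
# drawn inside the top-left cell.
sheet = [[255] * 20 for _ in range(20)]
for y in range(2, 8):
    sheet[y][5] = 0
samples = process_sheet(sheet, rows=2, cols=2, out_size=10)

# Counts stated in the abstract.
digit_images = 10 * 7000              # 10 digits, 7,000 images each
char_classes = 245_000 // 7000        # implied number of character classes
total_images = digit_images + 245_000
```

On real scans, a library such as OpenCV or Pillow would normally replace the list-based helpers, and locating the grid would need the kind of edge detection the abstract mentions, since scanned sheets are rarely perfectly aligned.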

Figures
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/a764/10018436/aaf682d2f78e/gr1.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/a764/10018436/88abb485021e/gr2.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/a764/10018436/5d8dd07a609d/gr3.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/a764/10018436/75e9cecf9706/gr4.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/a764/10018436/a6cf19fc9a76/gr5.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/a764/10018436/a644139accf5/gr6.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/a764/10018436/2bd0ce907cdc/gr7.jpg
