HCCD: A handwritten camera-captured dataset for document enhancement under varied degradation conditions.

Author information

Koushik K S, B J Bipin Nair, Rani N Shobha

Affiliations

Department of Computer Science, Amrita School of Computing, Amrita Vishwa Vidyapeetham, Mysuru, India.

Department of Artificial Intelligence and Data Science, GITAM School of Technology, Bengaluru, GITAM (Deemed to be) University, India.

Publication information

Data Brief. 2025 Jul 2;61:111849. doi: 10.1016/j.dib.2025.111849. eCollection 2025 Aug.

Abstract

Enhancing degraded handwritten documents captured with smartphone cameras remains a significant challenge in document analysis. Deep learning-based enhancement techniques have shown promise, but their performance depends heavily on the availability of carefully labeled ground-truth datasets. To address this gap, this study introduces the Handwritten Camera-Captured Dataset (HCCD), which supports document enhancement and recognition tasks in real-world scenarios. Unlike existing datasets, which are captured in controlled environments with scanners or smartphone cameras, HCCD features real-time, camera-captured handwritten documents exhibiting a range of natural degradations. These degradations include motion blur, shadow artifacts, and uneven lighting, reflecting the challenges encountered in real-life document digitization. Each handwritten document in the dataset is paired with a high-quality enhanced image created through a combination of computer vision-based imaging techniques. The documents are in Roman script and were contributed by multiple individuals with varying handwriting styles. The dataset is valuable for training machine learning and deep learning models for image restoration, denoising, and OCR applications. Each sample is annotated with rich metadata to support targeted research, including degradation type, severity level, and writer-specific demographics.
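
As an illustration of how the paired images and per-sample metadata described above might be used, the following is a minimal Python sketch. It is not the dataset's actual schema: the class name `HCCDSample`, the field names, the file paths, the degradation labels, and the integer severity scale are all assumptions made for this example, since the abstract does not specify the file layout or annotation format.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass(frozen=True)
class HCCDSample:
    """Hypothetical record: one degraded capture paired with its enhanced image."""
    degraded_path: str   # camera-captured image (assumed file layout)
    enhanced_path: str   # paired enhanced image (assumed file layout)
    degradation: str     # assumed labels: "motion_blur", "shadow", "uneven_lighting"
    severity: int        # assumed ordinal scale, 1 = mildest
    writer_id: str       # assumed anonymized writer identifier


def select(samples: Iterable[HCCDSample], degradation: str,
           min_severity: int = 1) -> List[HCCDSample]:
    """Return samples with the given degradation type at or above a severity level."""
    return [s for s in samples
            if s.degradation == degradation and s.severity >= min_severity]


# Made-up entries for demonstration only; they are not real dataset records.
samples = [
    HCCDSample("deg/0001.jpg", "enh/0001.jpg", "motion_blur", 2, "w01"),
    HCCDSample("deg/0002.jpg", "enh/0002.jpg", "shadow", 3, "w02"),
    HCCDSample("deg/0003.jpg", "enh/0003.jpg", "motion_blur", 1, "w03"),
]

blurred = select(samples, "motion_blur", min_severity=2)
print([s.degraded_path for s in blurred])
```

Filtering on the metadata in this way would let a restoration model be trained or evaluated on one degradation type at a time, with `degraded_path` as the input and `enhanced_path` as the target.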


https://cdn.ncbi.nlm.nih.gov/pmc/blobs/b882/12281058/2bd005ac0e99/gr1.jpg
