Degraded Historical Document Binarization: A Review on Issues, Challenges, Techniques, and Future Directions.

Authors

Sulaiman Alaa, Omar Khairuddin, Nasrudin Mohammad F

Affiliation

Pattern Recognition Research Group, Centre for Artificial Intelligence Technology, Faculty of Information Science and Technology, Universiti Kebangsaan Malaysia (UKM), Bangi 43600, Selangor, Malaysia.

Publication information

J Imaging. 2019 Apr 12;5(4):48. doi: 10.3390/jimaging5040048.

Abstract

In this era of digitization, most hardcopy documents are being converted into digital formats. In the process, large quantities of documents are stored and preserved through electronic scanning. These documents come from various sources, such as ancient documentation, old legal records, medical reports, music scores, palm leaf manuscripts, and reports on security-related issues. Ancient and historical documents in particular are hard to read because of degradation, namely low contrast and corrupted artefacts. Degraded document binarization has been studied widely in recent years, and several approaches have been developed to deal with its issues and challenges. This paper presents a comprehensive review of the issues and challenges faced during the image binarization process, followed by insights into the various methods used for image binarization. It also discusses advanced methods for enhancing degraded documents, which improve document quality during the binarization process. Finally, it discusses the effectiveness and robustness of existing methods and concludes that there is still scope to develop a hybrid approach that can deal with degraded document binarization more effectively.

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/1c8e/8320943/8467560e23c1/jimaging-05-00048-g001.jpg
