

HUTNet: An Efficient Convolutional Neural Network for Handwritten Uchen Tibetan Character Recognition.

Affiliation

Key Laboratory of China's Ethnic Languages and Information Technology of Ministry of Education, Northwest Minzu University, Lanzhou, China.

Publication Information

Big Data. 2023 Oct;11(5):387-398. doi: 10.1089/big.2021.0333. Epub 2023 Jan 19.

Abstract

Recognition of handwritten Uchen Tibetan character input has been considered an efficient way of acquiring mass data in the digital era. However, it still faces considerable challenges due to severely touching letters and the varied morphological features of identical characters. Deeper neural networks are therefore required to achieve decent recognition accuracy, making an efficient, lightweight model design important to balance the inevitable trade-off between accuracy and latency. To reduce the learnable parameters of the network as much as possible while maintaining acceptable accuracy, we introduce an efficient model named HUTNet based on the internal relationship between floating-point operations (FLOPs) and memory access cost (MAC). The proposed network achieves a ResNet-18-level accuracy of 96.86% with only a tenth of the parameters. Subsequent pruning and knowledge distillation strategies were applied to further reduce the inference latency of the model. Experiments on the test set of the Handwritten Uchen Tibetan Data set by Wang (HUTDW), containing 562 classes and 42,068 samples, show that the compressed model achieves 96.83% accuracy while maintaining lower FLOPs and fewer parameters. To verify the effectiveness of HUTNet, we also tested it on the Chinese handwriting data set HWDB1.1 (Handwriting Database 1.1), on which HUTNet achieved an accuracy of 97.24%, higher than that of ResNet-18 and ResNet-34. In general, we conduct extensive experiments on the resource-accuracy trade-off and show stronger performance than other well-known models on HUTDW and HWDB1.1. HUTNet also removes a critical bottleneck for handwritten Uchen Tibetan recognition on low-power computing devices.
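The abstract does not describe HUTNet's layer-level design, so the sketch below only illustrates two of the techniques it names in generic form: a depthwise-separable convolution block (a common way to cut FLOPs and memory access cost in lightweight CNNs) and a standard soft-target knowledge-distillation loss. The class names, hyperparameters (temperature, alpha), and tensor shapes are illustrative assumptions, not the paper's actual implementation.

```python
# Hypothetical sketch only: HUTNet's real architecture and training recipe
# are not given in the abstract; this shows the general techniques it cites.
import torch
import torch.nn as nn
import torch.nn.functional as F


class LightweightBlock(nn.Module):
    """Depthwise-separable convolution: depthwise 3x3 followed by pointwise 1x1."""

    def __init__(self, in_ch: int, out_ch: int, stride: int = 1):
        super().__init__()
        # Depthwise conv keeps FLOPs and MAC low by convolving each channel separately.
        self.depthwise = nn.Conv2d(in_ch, in_ch, 3, stride=stride,
                                   padding=1, groups=in_ch, bias=False)
        # Pointwise conv mixes channels cheaply.
        self.pointwise = nn.Conv2d(in_ch, out_ch, 1, bias=False)
        self.bn = nn.BatchNorm2d(out_ch)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return F.relu(self.bn(self.pointwise(self.depthwise(x))))


def distillation_loss(student_logits, teacher_logits, labels,
                      temperature: float = 4.0, alpha: float = 0.7):
    """Blend hard-label cross-entropy with softened teacher/student KL divergence."""
    hard = F.cross_entropy(student_logits, labels)
    soft = F.kl_div(
        F.log_softmax(student_logits / temperature, dim=1),
        F.softmax(teacher_logits / temperature, dim=1),
        reduction="batchmean",
    ) * (temperature ** 2)
    return alpha * soft + (1.0 - alpha) * hard


if __name__ == "__main__":
    # Shapes only illustrate a 562-class classifier like the HUTDW setting.
    x = torch.randn(2, 32, 28, 28)
    block = LightweightBlock(32, 64, stride=2)
    print(block(x).shape)  # torch.Size([2, 64, 14, 14])

    student = torch.randn(2, 562)
    teacher = torch.randn(2, 562)
    labels = torch.randint(0, 562, (2,))
    print(distillation_loss(student, teacher, labels).item())
```

In this kind of setup the distilled student (here, a stand-in for the compressed HUTNet) is trained against both ground-truth labels and a larger teacher's softened outputs, which is one common way to recover accuracy lost to pruning while keeping inference cost low.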

