Deep Learning and Particle Swarm Optimisation-Based Techniques for Visually Impaired Humans' Text Recognition and Identification.

Authors

Pandey Binay Kumar, Pandey Digvijay, Wariya Subodh, Aggarwal Gaurav, Rastogi Rahul

Affiliations

Department of Information Technology, College of Technology, Govind Ballabh Pant University of Agriculture and Technology, Pantnagar, Uttarakhand India.

Department of Computer Science and Engineering, Invertis University, Bareilly, India.

Publication information

Augment Hum Res. 2021;6(1):14. doi: 10.1007/s41133-021-00051-5. Epub 2021 Oct 29.


DOI:10.1007/s41133-021-00051-5
PMID:40477829
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC8553597/
Abstract

Blind people can benefit greatly from a system that localises and reads text embedded in natural scenes and provides useful information that boosts their self-esteem and autonomy in everyday situations. Although existing optical character recognition programmes appear to be quick and effective, most of them cannot correctly recognise text embedded in ordinary natural-scene images. The methodology described in this paper localises textual image regions and pre-processes them using the naïve Bayesian algorithm. A weighted reading technique is used to generate the correct text data from complicated image regions. Because images usually contain some noise, filtering is proposed during the early pre-processing step. To restore image quality, the input image is processed using gradient and contrast image methods. The contrast of the source images is then enhanced using an adaptive image map. The stroke width transform, Gabor transform, and weighted naïve Bayesian classifier are applied to complicated degraded images for segmentation, feature extraction, and detection of textual and non-textual elements. Finally, a combination of deep neural networks and particle swarm optimisation is used to recognise the categorised textual data. After recognition, the text in the image is converted into audio output. The IIIT5K dataset is used for development, and the performance of the proposed approach is evaluated using accuracy, recall, precision, and F1-score.
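
The abstract names accuracy, recall, precision, and F1-score as the evaluation measures but does not say how they are computed. The sketch below shows the standard definitions for a two-class problem, assuming label 1 means a text region and label 0 a non-text region. The names `BinaryMetrics` and `binary_metrics`, the label encoding, and the sample labels are illustrative assumptions and are not taken from the paper.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class BinaryMetrics:
    accuracy: float
    precision: float
    recall: float
    f1: float


def binary_metrics(y_true: Sequence[int], y_pred: Sequence[int]) -> BinaryMetrics:
    """Standard two-class metrics; label 1 = text, 0 = non-text (assumed encoding)."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if len(y_true) == 0:
        raise ValueError("at least one sample is required")

    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # text found as text
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # non-text rejected
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # non-text flagged as text
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # text missed

    accuracy = (tp + tn) / len(pairs)
    # Fall back to 0.0 when a denominator is empty (a convention, not a rule).
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return BinaryMetrics(accuracy, precision, recall, f1)


if __name__ == "__main__":
    # Made-up labels for eight candidate regions, for illustration only.
    truth = [1, 1, 1, 0, 0, 0, 1, 0]
    predicted = [1, 0, 1, 0, 1, 0, 1, 0]
    print(binary_metrics(truth, predicted))
```

With these made-up labels there are three true positives, three true negatives, one false positive, and one false negative, so all four measures come out to 0.75. Returning 0.0 when a denominator is empty is one common convention; other tools raise a warning or report the value as undefined.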

