Deep Learning and Particle Swarm Optimisation-Based Techniques for Visually Impaired Humans' Text Recognition and Identification.

Authors

Pandey Binay Kumar, Pandey Digvijay, Wariya Subodh, Aggarwal Gaurav, Rastogi Rahul

Affiliations

Department of Information Technology, College of Technology, Govind Ballabh Pant University of Agriculture and Technology, Pantnagar, Uttarakhand India.

Department of Computer Science and Engineering, Invertis University, Bareilly, India.

Publication information

Augment Hum Res. 2021;6(1):14. doi: 10.1007/s41133-021-00051-5. Epub 2021 Oct 29.

DOI:10.1007/s41133-021-00051-5
PMID:40477829
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC8553597/
Abstract

Blind people can benefit greatly from a system that localises and reads text embedded in natural scenes and conveys useful information, boosting their self-esteem and autonomy in everyday situations. Although existing optical character recognition programmes are quick and effective, most of them cannot correctly recognise text embedded in ordinary panorama images. The methodology described in this paper localises textual image regions and pre-processes them using the naïve Bayesian algorithm. A weighted reading technique is used to generate the correct text data from complicated image regions. Because images usually contain some noise, filtering is applied during the early pre-processing step. To restore image quality, the input image is processed using gradient and contrast image methods. The contrast of the source images is then enhanced using an adaptive image map. The stroke width transform, the Gabor transform, and a weighted naïve Bayesian classifier are applied to complicated degraded images for segmentation, feature extraction, and detection of textual and non-textual elements. Finally, a combination of deep neural networks and particle swarm optimisation is used to recognise the classified textual data. After recognition, the text in the image is converted into audio output. The IIIT5K dataset is used for development, and the performance of the proposed approach is evaluated using accuracy, recall, precision, and F1-score.
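
The abstract does not say how particle swarm optimisation is coupled with the deep neural network, so the following is only a minimal sketch of generic PSO, not the authors' method. It minimises a toy objective (`sphere`) that stands in for whatever loss or hyperparameter score the paper actually optimises. The function names, swarm size, inertia weight, and acceleration coefficients are all assumptions chosen for illustration.

```python
import random


def pso_minimize(objective, dim, n_particles=20, n_iters=200,
                 bounds=(-5.0, 5.0), inertia=0.7, c1=1.5, c2=1.5, seed=0):
    """Generic particle swarm optimisation (illustrative, not from the paper)."""
    rng = random.Random(seed)
    lo, hi = bounds

    # Random initial positions, zero initial velocities.
    positions = [[rng.uniform(lo, hi) for _ in range(dim)]
                 for _ in range(n_particles)]
    velocities = [[0.0] * dim for _ in range(n_particles)]

    # Personal bests start at the initial positions.
    best_pos = [p[:] for p in positions]
    best_val = [objective(p) for p in positions]

    # Global best is the best of the personal bests.
    g = min(range(n_particles), key=lambda i: best_val[i])
    global_pos, global_val = best_pos[g][:], best_val[g]

    for _ in range(n_iters):
        for i in range(n_particles):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                # Velocity = inertia + pull to personal best + pull to global best.
                velocities[i][d] = (
                    inertia * velocities[i][d]
                    + c1 * r1 * (best_pos[i][d] - positions[i][d])
                    + c2 * r2 * (global_pos[d] - positions[i][d])
                )
                # Move the particle and clamp it to the search bounds.
                new_x = positions[i][d] + velocities[i][d]
                positions[i][d] = min(hi, max(lo, new_x))

            val = objective(positions[i])
            if val < best_val[i]:
                best_val[i] = val
                best_pos[i] = positions[i][:]
                if val < global_val:
                    global_val = val
                    global_pos = positions[i][:]

    return global_pos, global_val


def sphere(x):
    """Toy objective with its minimum of 0 at the origin."""
    return sum(v * v for v in x)


if __name__ == "__main__":
    pos, val = pso_minimize(sphere, dim=3)
    print("best position:", pos)
    print("best value:", val)
```

In a setting like the one the abstract describes, `objective` could plausibly be a validation error computed from a candidate set of network weights or hyperparameters, but that mapping is an assumption here.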

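
The evaluation metrics named in the abstract (accuracy, recall, precision, and F1-score) can be computed from binary predictions as sketched below. The label convention (1 for text, 0 for non-text) and the sample label lists are assumptions made for illustration; they are not results from the paper.

```python
def classification_metrics(y_true, y_pred, positive=1):
    """Return accuracy, precision, recall, and F1 for binary labels."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("y_true and y_pred must be non-empty and equal in length")

    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = len(pairs) - tp - fp - fn

    accuracy = (tp + tn) / len(pairs)
    # Guard against division by zero when a class is never predicted or present.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


if __name__ == "__main__":
    # Hypothetical labels: 1 = text region, 0 = non-text region.
    y_true = [1, 1, 1, 0, 0, 1, 0, 1]
    y_pred = [1, 0, 1, 0, 1, 1, 0, 1]
    print(classification_metrics(y_true, y_pred))
```

F1 is the harmonic mean of precision and recall, so it stays low whenever either of the two is low, which makes it a useful summary when text and non-text regions are imbalanced.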
