
Leveraging RetinaNet-based object detection model for assisting visually impaired individuals with metaheuristic optimization algorithm.

Authors

Khadidos Alaa O, Yafoz Ayman

Affiliations

Department of Information Systems, Faculty of Computing and Information Technology, King Abdulaziz University, Jeddah, Saudi Arabia.

King Salman Center for Disability Research, Riyadh, 11614, Saudi Arabia.

Publication

Sci Rep. 2025 May 8;15(1):15979. doi: 10.1038/s41598-025-99903-y.

Abstract

Visually impaired individuals face many difficulties in everyday activities such as crossing roads, writing, locating objects, and reading. Although many navigation aids exist, efficient object detection (OD) methods for visually impaired people (VIP) still need improvement. OD is a core task in computer vision (CV) and plays an important role in recognizing and locating objects in an image. Elderly and visually challenged people need to identify objects accurately under challenging real-world conditions such as scale variation, occlusion, illumination changes, and blur. A considerable number of studies have addressed real-time object recognition with deep learning (DL); DL-based methods extract features autonomously and then classify and detect objects. This paper proposes a novel Object Detection Model for Visually Impaired Individuals with a Metaheuristic Optimization Algorithm (ODMVII-MOA). The proposed ODMVII-MOA technique aims to improve real-time OD for detecting and recognizing objects for disabled people. First, the image pre-processing stage applies a Wiener filter (WF) to enhance image quality by removing unwanted noise. Next, RetinaNet performs the OD step, recognizing and locating objects within an image. The ODMVII-MOA method then employs EfficientNetB0 for feature extraction and a long short-term memory autoencoder (LSTM-AE) for classification. Finally, the Dandelion Optimizer (DO) optimally tunes the hyperparameters of the LSTM-AE, yielding better classification performance. The ODMVII-MOA model is experimentally validated on an indoor OD dataset, and the outcomes are evaluated with respect to different measures. In the comparison study, the ODMVII-MOA model achieved a superior accuracy of 99.69% over existing techniques.
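The pre-processing step described above uses Wiener filtering to suppress noise before detection. Below is a minimal sketch of an adaptive (local-statistics) Wiener filter in NumPy, assuming a 2-D grayscale float image; the function name, window size, and noise-power estimate are illustrative and not the paper's implementation:

```python
import numpy as np

def wiener_denoise(img, k=3, noise_var=None):
    """Adaptive Wiener filter sketch: each pixel is shrunk toward its
    local mean by a gain that depends on the local signal-to-noise ratio.
    `img` is a 2-D float array; `k` is an odd window size."""
    pad = k // 2
    padded = np.pad(img, pad, mode="reflect")
    # local mean and variance over each k x k neighbourhood
    windows = np.lib.stride_tricks.sliding_window_view(padded, (k, k))
    local_mean = windows.mean(axis=(-1, -2))
    local_var = windows.var(axis=(-1, -2))
    if noise_var is None:
        # crude noise-power estimate: average local variance
        noise_var = local_var.mean()
    gain = np.maximum(local_var - noise_var, 0.0) / np.maximum(local_var, 1e-12)
    return local_mean + gain * (img - local_mean)
```

`scipy.signal.wiener` provides a similar adaptive filter if SciPy is available; the explicit version above just makes the local-statistics arithmetic visible.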
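The Dandelion Optimizer proper models rising, descending, and landing phases of dandelion seeds. The sketch below is a heavily simplified stand-in that keeps only the spirit of the method (random exploration that decays over iterations while seeds are pulled toward the best landing site), demonstrated on a toy sphere function rather than on LSTM-AE hyperparameters; all names and coefficients are illustrative assumptions:

```python
import numpy as np

def dandelion_opt(f, bounds, n_seeds=30, iters=100, seed=0):
    """Simplified metaheuristic in the spirit of the Dandelion Optimizer.
    `f` maps a 1-D parameter vector to a scalar loss to minimize;
    `bounds` is a (dim, 2) array of [low, high] per dimension."""
    rng = np.random.default_rng(seed)
    lo, hi = bounds[:, 0], bounds[:, 1]
    pop = rng.uniform(lo, hi, size=(n_seeds, len(lo)))
    fit = np.apply_along_axis(f, 1, pop)
    best = pop[fit.argmin()].copy()
    for t in range(iters):
        alpha = 1.0 - t / iters                       # "wind" dies down over time
        # exploration: random drift whose scale shrinks with alpha
        step = alpha * rng.standard_normal(pop.shape) * (hi - lo) * 0.1
        # exploitation: pull toward the current best landing site
        pull = (1.0 - alpha) * (best - pop) * rng.random(pop.shape)
        pop = np.clip(pop + step + pull, lo, hi)
        fit = np.apply_along_axis(f, 1, pop)
        if fit.min() < f(best):
            best = pop[fit.argmin()].copy()
    return best, f(best)
```

In the paper's pipeline the objective would be the LSTM-AE's validation loss as a function of its hyperparameters; the sphere function here simply keeps the sketch self-contained.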


Figure 1: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/c210/12062455/a7f512224252/41598_2025_99903_Fig1_HTML.jpg
