Khairnar Smita, Thepade Sudeep D, Kolekar Suresh, Gite Shilpa, Pradhan Biswajeet, Alamri Abdullah, Patil Bhagyesha, Dahake Shrutee, Gaikwad Radhika, Chaudhari Atharva
Department of Computer Engineering, Pimpri Chinchwad College of Engineering, Nigdi, Pune 411044, India.
PCET's, Pimpri Chinchwad University, Pune, India.
MethodsX. 2024 Dec 21;14:103131. doi: 10.1016/j.mex.2024.103131. eCollection 2025 Jun.
Recent advancements in artificial intelligence (AI) have increased interest in intelligent transportation systems, particularly autonomous vehicles. Safe navigation in traffic-heavy environments requires accurate road scene segmentation, yet traditional computer vision methods struggle with complex scenarios. This study emphasizes the role of deep learning in improving semantic segmentation using datasets like the Indian Driving Dataset (IDD), which presents unique challenges in chaotic road conditions. We propose a modified CANet that incorporates U-Net and LinkNet elements, focusing on accuracy, efficiency, and resilience. The CANet features an encoder-decoder architecture and a Multiscale Context Module (MCM) with three parallel branches to capture contextual information at multiple scales. Our experiments show that the proposed model achieves a mean Intersection over Union (mIoU) value of 0.7053, surpassing state-of-the-art models in efficiency and performance. Here we demonstrate:
•Traditional computer vision methods struggle with complex driving scenarios, but deep learning-based semantic segmentation methods show promising results.
•A modified CANet, incorporating U-Net and LinkNet elements, is proposed for semantic segmentation of unstructured driving scenarios.
•The CANet structure consists of an encoder-decoder architecture and a Multiscale Context Module (MCM) with three parallel branches to capture contextual information at multiple scales.
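As a rough illustration of how a Multiscale Context Module with three parallel branches can be wired, the sketch below applies 3×3 convolutions with dilation rates 1, 2, and 4 in parallel and fuses them with a 1×1 convolution. The branch composition, dilation rates, and channel widths are assumptions for illustration only; the abstract states just that the MCM uses three parallel branches to capture context at multiple scales.

```python
import torch
import torch.nn as nn


class MultiscaleContextModule(nn.Module):
    """Illustrative MCM with three parallel branches (assumed dilated convolutions)."""

    def __init__(self, in_channels: int, out_channels: int):
        super().__init__()
        # Hypothetical branch design: 3x3 convs with dilation rates 1, 2, 4.
        self.branches = nn.ModuleList([
            nn.Sequential(
                nn.Conv2d(in_channels, out_channels, kernel_size=3,
                          padding=rate, dilation=rate, bias=False),
                nn.BatchNorm2d(out_channels),
                nn.ReLU(inplace=True),
            )
            for rate in (1, 2, 4)
        ])
        # Fuse the three branch outputs back to the requested channel width.
        self.fuse = nn.Conv2d(3 * out_channels, out_channels, kernel_size=1)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Run the branches in parallel and concatenate along the channel axis.
        feats = torch.cat([branch(x) for branch in self.branches], dim=1)
        return self.fuse(feats)


if __name__ == "__main__":
    mcm = MultiscaleContextModule(in_channels=256, out_channels=256)
    dummy = torch.randn(1, 256, 32, 32)   # e.g., an encoder feature map
    print(mcm(dummy).shape)               # torch.Size([1, 256, 32, 32])
```

In an encoder-decoder network of the U-Net/LinkNet family, such a module would typically sit between the encoder and decoder so that the decoder receives features aggregated over several receptive-field sizes.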
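For reference, the reported mIoU of 0.7053 is the Intersection over Union averaged over classes. A minimal sketch of that computation is shown below; it assumes integer label maps and skips classes absent from both prediction and ground truth, which may differ from the exact accumulation scheme used in the paper.

```python
import numpy as np


def mean_iou(pred: np.ndarray, target: np.ndarray, num_classes: int) -> float:
    """Mean IoU over classes for integer label maps of identical shape."""
    ious = []
    for cls in range(num_classes):
        pred_mask = pred == cls
        target_mask = target == cls
        union = np.logical_or(pred_mask, target_mask).sum()
        if union == 0:          # class absent from both maps: skip it
            continue
        intersection = np.logical_and(pred_mask, target_mask).sum()
        ious.append(intersection / union)
    return float(np.mean(ious))


if __name__ == "__main__":
    pred = np.array([[0, 1], [1, 2]])
    target = np.array([[0, 1], [2, 2]])
    # Per-class IoUs are 1.0, 0.5, 0.5, so the mean is about 0.6667.
    print(round(mean_iou(pred, target, num_classes=3), 4))
```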