Real-time driver drowsiness detection using transformer architectures: a novel deep learning approach.

Author information

Hassan Osama F, Ibrahim Ahmed F, Gomaa Ahmed, Makhlouf M A, Hafiz B

Affiliations

Information Systems Department, Faculty of Computers and Informatics, Suez Canal University, Ismailia, 41522, Egypt.

Artificial Intelligence Department, Faculty of Computer Science and Engineering, King Salman International University (KSIU), South Sinai, 46511, Egypt.

Publication information

Sci Rep. 2025 May 20;15(1):17493. doi: 10.1038/s41598-025-02111-x.

Abstract

Driver drowsiness is a leading cause of road accidents, resulting in significant societal, economic, and emotional losses. This paper introduces a novel and robust deep learning-based framework for real-time driver drowsiness detection, leveraging state-of-the-art transformer architectures and transfer learning models to achieve unprecedented accuracy and reliability. The proposed methodology addresses key challenges in drowsiness detection by integrating advanced data preprocessing techniques, including image normalization, augmentation, and region-of-interest selection using Haar Cascade classifiers. We employ the MRL Eye Dataset to classify eye states into "Open-Eyes" and "Close-Eyes," evaluating a range of models, including Vision Transformer (ViT), Swin Transformer, and fine-tuned transfer learning models such as VGG19, DenseNet169, ResNet50V2, InceptionResNetV2, InceptionV3, and MobileNet. The ViT and Swin Transformer models achieved groundbreaking accuracy rates of 99.15% and 99.03%, respectively, outperforming all other models in precision, recall, and F1-score. To ensure the generalization and robustness of the proposed models, we also evaluate their performance on the NTHU-DDD and CEW datasets, which provide diverse real-world scenarios and challenging conditions. This represents a significant advancement over existing methods, demonstrating the effectiveness of transformer-based architectures in capturing complex spatial dependencies and extracting relevant features for drowsiness detection. The proposed system also incorporates a real-time drowsiness scoring mechanism, which triggers alarms when prolonged eye closure is detected, ensuring timely intervention to prevent accidents. A key novelty of this work lies in the integration of Class Activation Mapping (CAM) for enhanced model interpretability, allowing the system to focus on critical eye regions and improve decision-making transparency. 
The system was rigorously tested under varying lighting conditions and scenarios involving glasses, showcasing its robustness and adaptability for real-world deployment. By combining cutting-edge deep learning techniques with real-time processing capabilities, this research offers a contactless, reliable, and efficient solution for driver drowsiness detection, significantly contributing to improved road safety and accident prevention. The proposed framework sets a new benchmark in drowsiness detection, highlighting its potential for widespread adoption in advanced driver assistance systems.
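
The abstract mentions a real-time drowsiness scoring mechanism that raises an alarm on prolonged eye closure but does not give its details. A minimal sketch of one common way to build such a score is shown below, assuming a per-frame eye-state classifier already exists. The class name `DrowsinessScorer`, the decay-by-one rule, and the threshold value are all illustrative assumptions, not the authors' implementation.

```python
from dataclasses import dataclass


@dataclass
class DrowsinessScorer:
    """Accumulates a score while the eyes are classified as closed.

    Hypothetical helper: the threshold and update rule are assumptions
    chosen for illustration, not values taken from the paper.
    """
    alarm_threshold: int = 15  # assumed number of score points before the alarm
    score: int = 0

    def update(self, eyes_closed: bool) -> bool:
        """Feed one per-frame prediction; return True if the alarm should sound."""
        if eyes_closed:
            self.score += 1
        else:
            # Decay instead of resetting, so a single "open" frame
            # (for example a misclassification) does not clear the score.
            self.score = max(0, self.score - 1)
        return self.score >= self.alarm_threshold


scorer = DrowsinessScorer(alarm_threshold=3)
# Hypothetical stream of classifier outputs (True means "Close-Eyes").
frames = [True, False, True, True, True, True]
alarms = [scorer.update(closed) for closed in frames]
print(alarms)
```

In a live system, `update` would be called once per video frame with the model's prediction for the cropped eye region, and a suitable threshold would depend on the camera frame rate and on how long a closure should count as prolonged.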

Figure 1: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/11ee/12092738/49f5094b5fdf/41598_2025_2111_Fig1_HTML.jpg