Graduate School of Science and Technology, Keio University, Yokohama 223-8522, Japan.
Faculty of Science and Technology, Keio University, Yokohama 223-8522, Japan.
Sensors (Basel). 2024 Oct 13;24(20):6604. doi: 10.3390/s24206604.
Secondary actions in vehicles are activities that drivers engage in while driving that are not directly related to the primary task of operating the vehicle. Secondary Action Recognition (SAR) in drivers is vital for enhancing road safety and minimizing accidents related to distracted driving. It also plays an important role in modern driving systems such as Advanced Driver Assistance Systems (ADASs), as it helps identify distractions and predict the driver's intent. Traditional methods of action recognition in vehicles mostly rely on RGB videos, which can be significantly impacted by external conditions such as low light levels. In this research, we introduce a novel method for SAR that utilizes depth-video data obtained from a depth sensor mounted in the vehicle. Our methodology leverages a Convolutional Neural Network (CNN) enhanced by a Spatial Enhanced Attention Mechanism (SEAM) and combined with Bidirectional Long Short-Term Memory (Bi-LSTM) networks. This design significantly improves action recognition in depth videos by strengthening both the spatial and temporal modeling. We conduct experiments using K-fold cross-validation, and the results on the public benchmark dataset Drive&Act show that our proposed method significantly outperforms state-of-the-art methods, reaching an SAR accuracy of about 84% on depth videos.
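To make the pipeline described above concrete, the following stdlib-only sketch walks the tensor shapes through the stages the abstract names (per-frame CNN, SEAM spatial attention, Bi-LSTM over the frame sequence, classifier). All layer sizes, strides, and the class count are illustrative assumptions for this sketch, not the authors' actual configuration.

```python
# Hypothetical shape walk-through of a CNN + SEAM + Bi-LSTM pipeline for
# depth-video action recognition. Every dimension here is an assumption
# made for illustration; the paper's real hyperparameters may differ.

def conv_out(size, kernel=3, stride=2, pad=1):
    """Spatial size after one strided convolution (standard formula)."""
    return (size + 2 * pad - kernel) // stride + 1

def pipeline_shapes(frames=16, height=224, width=224, channels=64,
                    lstm_hidden=128, num_classes=34):
    # 1) Per-frame CNN backbone: two strided convs shrink each depth frame.
    h, w = height, width
    for _ in range(2):
        h, w = conv_out(h), conv_out(w)
    cnn_shape = (frames, channels, h, w)

    # 2) SEAM: spatial attention reweights the (h, w) feature map per frame;
    #    the tensor shape is unchanged, only the feature values are rescaled.
    seam_shape = cnn_shape

    # 3) Global spatial pooling collapses (h, w) to one vector per frame.
    pooled_shape = (frames, channels)

    # 4) Bi-LSTM over the frame sequence: forward and backward hidden
    #    states are concatenated at each timestep.
    bilstm_shape = (frames, 2 * lstm_hidden)

    # 5) Classifier head maps the sequence summary to action logits.
    logits_shape = (num_classes,)
    return cnn_shape, seam_shape, pooled_shape, bilstm_shape, logits_shape
```

Running `pipeline_shapes()` with these assumed defaults traces a 16-frame clip from `(16, 64, 56, 56)` CNN features through to a 34-way logit vector, showing where the spatial (SEAM) and temporal (Bi-LSTM) enhancements act.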