Department of Civil Engineering, College of Engineering, Ocean University of China, Qingdao 266100, China.
Sensors (Basel). 2021 Nov 8;21(21):7424. doi: 10.3390/s21217424.
With the rapid spread of in-vehicle information systems such as smartphones, navigation systems, and radios, the number of traffic accidents caused by driver distraction is increasing. Timely identification of and warning about distracted driving are crucial, and establishing driver assistance systems is of great value. However, almost all research on recognizing drivers' distracted actions with computer vision methods has neglected the importance of temporal information for action recognition. This paper proposes a hybrid deep learning model for recognizing the actions of distracted drivers. Specifically, we used OpenPose to obtain skeleton information of the human body and then constructed the vector angles and modulus ratios of the human body structure as features to describe the driver's actions, thereby fusing deep network features with hand-crafted features, which improves the information density of the spatial features. The K-means clustering algorithm was used to preselect the original frames, and inter-frame comparison was then used to obtain the final keyframe sequence by comparing the Euclidean distance between the manually constructed vector representing each frame and the vector representing the cluster center. Finally, we constructed a two-layer long short-term memory (LSTM) neural network to obtain more effective spatiotemporal features, followed by a softmax layer to identify the distracted driver's action. Experimental results on the collected dataset demonstrate the effectiveness of this framework, which can provide a theoretical basis for the establishment of vehicle distraction warning systems.
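The feature construction and keyframe selection described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the joint names (`shoulder`, `elbow`, `wrist`), the choice of one angle and one ratio per frame, the evenly spaced K-means initialization, and the rule of keeping the single frame closest to each cluster center are all assumptions made for illustration. The OpenPose step and the LSTM classifier are outside the scope of the sketch.

```python
import math


def vector_angle(u, v):
    """Angle in radians between 2-D vectors u and v (0.0 if either is zero)."""
    nu, nv = math.hypot(*u), math.hypot(*v)
    if nu == 0.0 or nv == 0.0:
        return 0.0
    cos = (u[0] * v[0] + u[1] * v[1]) / (nu * nv)
    return math.acos(max(-1.0, min(1.0, cos)))  # clamp against rounding error


def modulus_ratio(u, v):
    """Ratio of the lengths |u| / |v| (0.0 if v is zero)."""
    nv = math.hypot(*v)
    return math.hypot(*u) / nv if nv else 0.0


def frame_features(kp):
    """Build a feature vector from one frame's keypoints.

    kp maps a joint name to (x, y). The joint names are hypothetical
    stand-ins for whatever the pose estimator returns.
    """
    upper_arm = (kp["elbow"][0] - kp["shoulder"][0],
                 kp["elbow"][1] - kp["shoulder"][1])
    forearm = (kp["wrist"][0] - kp["elbow"][0],
               kp["wrist"][1] - kp["elbow"][1])
    return [vector_angle(upper_arm, forearm), modulus_ratio(forearm, upper_arm)]


def assign(points, centers):
    """Label each point with the index of its nearest center."""
    return [min(range(len(centers)), key=lambda j: math.dist(p, centers[j]))
            for p in points]


def kmeans(points, k, iters=20):
    """Plain Lloyd's K-means with a deterministic, evenly spaced init (assumed)."""
    n = len(points)
    centers = [list(points[(i * n) // k]) for i in range(k)]
    for _ in range(iters):
        labels = assign(points, centers)
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:  # keep the old center if a cluster is empty
                centers[j] = [sum(c) / len(members) for c in zip(*members)]
    return centers, assign(points, centers)


def select_keyframes(features, k):
    """Keep, per cluster, the frame closest to the center, in temporal order."""
    centers, labels = kmeans(features, k)
    keyframes = []
    for j in range(k):
        members = [i for i, lab in enumerate(labels) if lab == j]
        if members:
            keyframes.append(
                min(members, key=lambda i: math.dist(features[i], centers[j])))
    return sorted(keyframes)
```

In a full pipeline, `frame_features` would be applied to each frame's keypoints, `select_keyframes` would choose which frames to keep, and the feature vectors of those frames, in order, would form the input sequence for the recurrent classifier.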