Xiao Jun, Li Honghan, Zhao Ji
School of Computer Science and Software Engineering, University of Science and Technology Liaoning, Anshan, 114051, China.
Sci Rep. 2025 May 25;15(1):18256. doi: 10.1038/s41598-025-02833-y.
To meet the demand for efficient and accurate recognition of traffic police gestures by driverless vehicles, this paper introduces a novel traffic police gesture recognition framework, the Novel Traffic Police Gesture Recognizer (NTPGR). First, keypoints related to traffic police gestures are extracted using the Efficient Progressive Feature Fusion Network (EPFFNet), followed by feature modeling and fusion so that the recognition network can better learn the temporal characteristics of gestures. A convolution network branch and a hybrid attention branch are then incorporated to further extract skeleton information from the traffic police gesture data, assign different temporal weights to key frames, and enhance the focus on important channels. Finally, in conjunction with Long Short-Term Memory (LSTM), a multi-branch gesture recognition network, termed the Multi-Sequence Gesture Recognition Network (MSNet), is proposed to integrate the three branches of gesture features, thereby enhancing the targeted extraction of temporal characteristics in traffic police gestures. Experimental results indicate that NTPGR achieves 97.56% and 96.76% accuracy on the Police Gesture Dataset and the UTD-MHAD Dataset, respectively, with average response times of 0.76 s and 0.74 s. It not only recognizes traffic police gestures in real time with high efficiency but also demonstrates strong robustness and credibility in complex environments and dynamic scenarios.
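The idea of assigning different temporal weights to key frames can be illustrated with a minimal sketch. This is not the authors' hybrid attention branch, which is learned inside MSNet. It assumes a hand-written, hypothetical key-frame score (total keypoint displacement from the previous frame) and turns those scores into weights with a softmax. The names `motion_scores` and `temporal_attention_pool` and the toy keypoints are illustrative assumptions.

```python
import math


def softmax(scores):
    """Convert raw scores into positive weights that sum to 1."""
    peak = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def motion_scores(frames):
    """Hypothetical key-frame score: summed keypoint displacement
    relative to the previous frame (the first frame scores 0)."""
    scores = [0.0]
    for prev, cur in zip(frames, frames[1:]):
        scores.append(sum(math.dist(p, c) for p, c in zip(prev, cur)))
    return scores


def temporal_attention_pool(frames):
    """Weight each frame by its softmax score and sum the flattened
    keypoint coordinates into one pooled feature vector."""
    weights = softmax(motion_scores(frames))
    pooled = [0.0] * sum(len(point) for point in frames[0])
    for weight, frame in zip(weights, frames):
        flat = [value for point in frame for value in point]
        for i, value in enumerate(flat):
            pooled[i] += weight * value
    return weights, pooled


# Toy sequence (assumed data): 3 frames, 2 keypoints with (x, y) each.
# Only the last frame contains motion, so it should get the largest weight.
frames = [
    [(0.0, 0.0), (1.0, 0.0)],
    [(0.0, 0.0), (1.0, 0.0)],
    [(0.0, 0.0), (1.0, 2.0)],
]
weights, pooled = temporal_attention_pool(frames)
```

In a learned model the scores would come from trainable layers rather than a fixed displacement rule. The softmax-weighted sum over frames is the part this sketch has in common with temporal attention generally.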