Dpto. de Lenguajes y Sistemas Informáticos, Universidad de Sevilla, Spain.
Institute of Information Science and Technologies of the National Research Council of Italy (ISTI-CNR), Pisa, Italy.
Neural Netw. 2023 Oct;167:489-501. doi: 10.1016/j.neunet.2023.08.043. Epub 2023 Aug 26.
Violent assaults and homicides occur daily, and the number of victims of mass shootings increases every year. However, this number can be reduced with the help of closed-circuit television (CCTV) and weapon detection models, as generic object detectors have become increasingly accurate with more training data. We present a new semi-supervised learning methodology based on conditioned cooperative student-teacher training, in which optimal pseudo-labels are generated using a novel confidence threshold search method and both models are improved through conditional knowledge transfer. Furthermore, a novel firearms image dataset of 458,599 images was collected using Instagram hashtags to evaluate our approach and to compare the improvements obtained using a specific unsupervised dataset instead of a general one such as ImageNet. We compared our methodology with supervised, semi-supervised and self-supervised learning techniques, outperforming approaches such as YOLOv5m (up to +19.86), YOLOv5l (up to +6.52), Unbiased Teacher (up to +10.5 AP), DETReg (up to +2.8 AP) and UP-DETR (up to +1.22 AP).
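The abstract does not describe how the confidence threshold search works, so the following is only a minimal sketch of the general idea of choosing a threshold for pseudo-label generation. It assumes a plain grid search that maximizes F1 on a small labeled validation set; the function names, the F1 criterion and the candidate grid are illustrative assumptions, not the authors' procedure.

```python
from typing import Any, List, Sequence, Tuple

# ASSUMPTION: each validation detection is (confidence, is_true_positive).
ScoredDetection = Tuple[float, bool]


def f1_at_threshold(scored: Sequence[ScoredDetection], num_gt: int, thr: float) -> float:
    """F1 score of the detections kept when filtering at confidence >= thr."""
    kept = [is_tp for conf, is_tp in scored if conf >= thr]
    tp = sum(kept)              # kept detections that match a ground-truth box
    fp = len(kept) - tp         # kept detections that match nothing
    fn = num_gt - tp            # ground-truth boxes left undetected
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def search_threshold(scored: Sequence[ScoredDetection], num_gt: int,
                     candidates: Sequence[float]) -> float:
    """Hypothetical search: return the candidate threshold with the highest F1."""
    return max(candidates, key=lambda t: f1_at_threshold(scored, num_gt, t))


def pseudo_labels(predictions: Sequence[Tuple[float, Any]], thr: float) -> List[Tuple[float, Any]]:
    """Keep teacher predictions (confidence, box) that pass the chosen threshold."""
    return [p for p in predictions if p[0] >= thr]


# Toy validation detections with 3 ground-truth objects in total.
validation = [(0.95, True), (0.90, True), (0.60, False),
              (0.55, True), (0.30, False), (0.20, False)]
best_thr = search_threshold(validation, num_gt=3, candidates=[0.1, 0.3, 0.5, 0.7, 0.9])

# Teacher predictions on unlabeled images, filtered into pseudo-labels.
kept = pseudo_labels([(0.8, "box_a"), (0.4, "box_b")], best_thr)
```

In this toy setup, a low threshold admits false positives and a high one discards a correct detection, so the search settles on an intermediate value. In a student-teacher loop, the filtered predictions would then serve as training targets for the student.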