Kaur Sumandeep, Kaur Lakhwinder, Lal Madan
Department of Computer Science and Engineering, Punjabi University, Patiala, 147001, India.
Sci Rep. 2024 Nov 4;14(1):26651. doi: 10.1038/s41598-024-75923-y.
Key frame extraction is central to video summarization and content-based video analysis because it addresses the problem of data redundancy in a video. It enables quick navigation and expert video arrangement in many applications, and visually impaired users can benefit from it for rapid object recognition and tracking. Most key frame extraction techniques consider only a single visual feature rather than multiple features or the full pictorial information of the video. This study proposes a key frame extraction method that (i) removes insignificant frames by pre-processing; (ii) extracts and aggregates four visual and structural feature differences between consecutive frames to identify informative frames; (iii) clusters the resulting frames with a hybrid FCM-AHA method, which combines Fuzzy C-means (FCM) with the artificial hummingbird optimization algorithm (AHA) to circumvent the local-minima trapping problem of FCM; and (iv) selects, from each cluster, the two frames with the greatest Euclidean distance from all other frames in that cluster as key frames, thereby removing redundant frames. Experimental results on the Open Video and YouTube datasets show that the proposed method outperforms state-of-the-art methods in both subjective qualitative analysis and objective quantitative evaluation (Precision, Recall, and F-score). Results on a real video further demonstrate its applicability in real-life settings.
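The clustering and selection steps described in the abstract can be illustrated with a minimal sketch. It rests on several assumptions that are not taken from the paper: it uses plain Fuzzy C-means with random initialization and leaves out the AHA optimization entirely, it assigns each frame to its highest-membership cluster, and it reads "greatest Euclidean distance from all the other frames" as the largest summed distance to the other cluster members. The function names (`fuzzy_c_means`, `select_key_frames`) and the toy `features` list are hypothetical stand-ins for the aggregated per-frame features.

```python
import math
import random


def euclidean(a, b):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def fuzzy_c_means(points, n_clusters, m=2.0, n_iter=100, tol=1e-6, seed=0):
    """Plain FCM (no AHA step). Returns (centers, U), U[i][k] = membership."""
    rng = random.Random(seed)
    dim = len(points[0])
    # Random initial memberships, each row normalised to sum to 1.
    U = []
    for _ in points:
        row = [rng.random() + 1e-9 for _ in range(n_clusters)]
        total = sum(row)
        U.append([u / total for u in row])
    centers = []
    for _ in range(n_iter):
        # Centre update: mean of the points weighted by membership ** m.
        centers = []
        for k in range(n_clusters):
            w = [U[i][k] ** m for i in range(len(points))]
            total = sum(w)
            centers.append(
                [sum(wi * p[d] for wi, p in zip(w, points)) / total
                 for d in range(dim)]
            )
        # Membership update: u_ik is proportional to d_ik ** (-2 / (m - 1)).
        shift = 0.0
        for i, p in enumerate(points):
            inv = [max(euclidean(p, c), 1e-12) ** (-2.0 / (m - 1.0))
                   for c in centers]
            total = sum(inv)
            new_row = [v / total for v in inv]
            shift = max(shift, max(abs(a - b) for a, b in zip(new_row, U[i])))
            U[i] = new_row
        if shift < tol:  # memberships have stopped changing
            break
    return centers, U


def select_key_frames(points, U, per_cluster=2):
    """Pick, per cluster, the frames with the largest summed distance
    to the other members (an assumed reading of the selection rule)."""
    labels = [max(range(len(row)), key=row.__getitem__) for row in U]
    selected = []
    for k in range(len(U[0])):
        members = [i for i, lab in enumerate(labels) if lab == k]
        scores = {
            i: sum(euclidean(points[i], points[j]) for j in members)
            for i in members
        }
        ranked = sorted(members, key=lambda i: scores[i], reverse=True)
        selected.extend(ranked[:per_cluster])
    return sorted(selected)


# Toy per-frame feature vectors (hypothetical): two well-separated groups.
features = [
    [0.0, 0.0], [0.1, 0.0], [0.0, 0.1], [1.0, 1.0],
    [10.0, 10.0], [10.1, 10.0], [10.0, 10.1], [11.0, 11.0],
]
centers, U = fuzzy_c_means(features, n_clusters=2)
key_frames = select_key_frames(features, U)
print(key_frames)
```

Because plain FCM depends on its random initialization, it can settle in a poor local minimum on harder data; the abstract states that the hybrid FCM-AHA method is meant to circumvent that problem, and this sketch does not attempt to.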