Shaanxi Normal University, Xi'an, Shaanxi 710000, China.
Comput Intell Neurosci. 2021 Nov 24;2021:1825273. doi: 10.1155/2021/1825273. eCollection 2021.
Traditional annotation-based video retrieval relies on manually labeling videos with text; this is inefficient, highly subjective, and generally cannot accurately describe a video's meaning. Traditional content-based video retrieval instead uses convolutional neural networks to extract low-level image features, builds indexes from them, and retrieves similar videos by comparing feature vectors under a chosen similarity measure. In this paper, after studying the characteristics of sports video, we propose a histogram difference method based on transfer learning for detecting abrupt shot transitions (cuts) and a four-step method based on block matching for detecting gradual transitions (fades). Adaptive thresholding marks regions with large frame-difference changes as candidate shot regions, and the cut-detection algorithm then determines the shot boundaries. Exploiting the characteristics of sports video, we further propose a key frame extraction algorithm based on clustering and optical flow analysis and compare it experimentally with the traditional clustering method; the algorithm effectively removes redundant frames, and the extracted key frames are more representative. Extensive experiments show that the proposed keyword fuzzy search algorithm, built on an improved deep neural network with ontology-based semantic expansion, achieves desirable retrieval performance, and the method is feasible for low-level video feature extraction, annotation, and keyword search. A notable strength of the algorithm is that it quickly and effectively retrieves the desired video from large collections of Internet video resources, reducing the false detection rate and the missed detection rate while improving fidelity, which largely meets everyday needs.
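As a rough illustration of the cut-detection step, the sketch below computes a colour-histogram difference between consecutive frames and flags frames whose difference exceeds an adaptive threshold, here the mean plus k standard deviations of the whole difference sequence. This is a minimal sketch under assumptions: plain HSV histograms stand in for the paper's transfer-learning features, and the file name match.mp4 and the factor k are illustrative, not from the paper.

```python
# Minimal sketch: histogram-difference shot-cut detection with an
# adaptive threshold. HSV histograms are an assumption standing in for
# the paper's transfer-learning features.
import cv2
import numpy as np

def frame_histogram(frame, bins=(32, 32)):
    """Normalised 2-D hue/saturation histogram of one frame."""
    hsv = cv2.cvtColor(frame, cv2.COLOR_BGR2HSV)
    hist = cv2.calcHist([hsv], [0, 1], None, list(bins), [0, 180, 0, 256])
    return cv2.normalize(hist, hist).flatten()

def detect_cuts(video_path, k=3.0):
    """Flag frames whose histogram difference exceeds mean + k * std."""
    cap = cv2.VideoCapture(video_path)
    diffs, prev_hist = [], None
    while True:
        ok, frame = cap.read()
        if not ok:
            break
        hist = frame_histogram(frame)
        if prev_hist is not None:
            # Bhattacharyya distance between consecutive frame histograms
            diffs.append(cv2.compareHist(prev_hist, hist,
                                         cv2.HISTCMP_BHATTACHARYYA))
        prev_hist = hist
    cap.release()
    diffs = np.asarray(diffs)
    # Adaptive threshold derived from the whole difference sequence
    threshold = diffs.mean() + k * diffs.std()
    # Candidate shot boundaries: the frame just after each large jump
    return np.where(diffs > threshold)[0] + 1

if __name__ == "__main__":
    print(detect_cuts("match.mp4"))  # hypothetical sports clip
```

In practice the threshold could also be computed over a sliding window so that a quiet segment does not suppress boundaries in a busy one; the global statistic is kept here only for brevity.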
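The key frame step can be pictured along similar lines. The following sketch clusters frame features with k-means, takes the frame nearest each cluster centroid as a candidate key frame, and uses dense Farneback optical flow to drop nearly static candidates as redundant. The feature choice, cluster count, and motion threshold are assumptions for illustration; the paper's exact clustering and optical-flow criteria may differ.

```python
# Minimal sketch: key-frame extraction by k-means clustering of frame
# features plus an optical-flow redundancy check. Parameters are
# illustrative assumptions, not the paper's exact design.
import cv2
import numpy as np
from sklearn.cluster import KMeans

def hs_histogram(frame):
    """Normalised hue/saturation histogram used as a simple frame feature."""
    hsv = cv2.cvtColor(frame, cv2.COLOR_BGR2HSV)
    hist = cv2.calcHist([hsv], [0, 1], None, [32, 32], [0, 180, 0, 256])
    return cv2.normalize(hist, hist).flatten()

def extract_keyframes(frames, n_clusters=5, motion_thresh=0.5):
    """One candidate per cluster; optical flow drops static duplicates."""
    feats = np.array([hs_histogram(f) for f in frames])
    labels = KMeans(n_clusters=n_clusters, n_init=10).fit_predict(feats)
    keyframes = []
    for c in range(n_clusters):
        idxs = np.where(labels == c)[0]
        centroid = feats[idxs].mean(axis=0)
        # Candidate: frame whose feature is closest to the cluster centroid
        best = int(idxs[np.argmin(
            np.linalg.norm(feats[idxs] - centroid, axis=1))])
        if best == 0:
            keyframes.append(best)
            continue
        prev = cv2.cvtColor(frames[best - 1], cv2.COLOR_BGR2GRAY)
        curr = cv2.cvtColor(frames[best], cv2.COLOR_BGR2GRAY)
        flow = cv2.calcOpticalFlowFarneback(prev, curr, None,
                                            0.5, 3, 15, 3, 5, 1.2, 0)
        # Mean flow magnitude; a nearly static candidate duplicates its
        # neighbourhood and is treated as redundant.
        if np.linalg.norm(flow, axis=2).mean() > motion_thresh:
            keyframes.append(best)
    return sorted(keyframes)
```

Combining the two signals is the point of the design: clustering alone can return visually similar representatives from adjacent clusters, while the optical-flow check discards candidates that carry no new motion information.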