Li Zhu, Schuster Guido M, Katsaggelos Aggelos K, Gandhi Bhavan
Multimedia Research Laboratory (MRL), Motorola Laboratories, Schaumburg, IL 60196, USA.
IEEE Trans Image Process. 2005 Oct;14(10):1550-60. doi: 10.1109/tip.2005.854477.
The need for video summarization originates primarily from a viewing time constraint. A shorter version of the original video sequence is desirable in a number of applications, and it is necessary in applications where storage, communication bandwidth, and/or power are limited. The summarization process inevitably introduces distortion. The amount of summarization distortion is related to the "conciseness" of the summary, that is, the number of frames available in it. If there are m frames in the original sequence and n frames in the summary, we define the summarization rate as m/n to characterize this "conciseness." We also develop a new summarization distortion metric and formulate the summarization problem as a rate-distortion optimization problem. Optimal algorithms based on dynamic programming are presented and compared experimentally with heuristic algorithms. Practical constraints, such as the maximum number of frames that can be skipped, are also considered in the formulation and solution of the problem.
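The abstract does not spell out the distortion metric or the recursion, so the following is only a minimal sketch of how a dynamic-programming formulation of this kind might look, under stated assumptions: each frame is a small feature vector, a skipped frame is reconstructed by repeating the most recent summary frame, frame distortion is the Euclidean distance between feature vectors, and the first frame is always kept. The function name `summarize_dp` and the parameter `max_skip` are hypothetical and are not taken from the paper.

```python
import math
from typing import List, Optional, Sequence, Tuple


def summarize_dp(
    frames: Sequence[Sequence[float]],
    n: int,
    max_skip: Optional[int] = None,
) -> Tuple[List[int], float]:
    """Pick n of the m frames so that the average distortion is minimal.

    Assumptions (not from the source): frame 0 is always selected, each
    skipped frame is represented by the latest selected frame before it,
    and distortion is the Euclidean distance between feature vectors.
    """
    m = len(frames)
    if not 1 <= n <= m:
        raise ValueError("n must satisfy 1 <= n <= m")
    if max_skip is None:
        max_skip = m  # effectively no constraint on skipped frames

    # seg[i][j]: total distortion of frames i..j-1 when frame i stands in
    # for all of them.
    seg = [[0.0] * (m + 1) for _ in range(m)]
    for i in range(m):
        total = 0.0
        for k in range(i, m):
            total += math.dist(frames[i], frames[k])
            seg[i][k + 1] = total

    # best[t][j]: minimal distortion of frames 0..j-1 when the t-th
    # selected frame is frame j. back[t][j] stores the previous selection.
    best = [[math.inf] * m for _ in range(n + 1)]
    back = [[-1] * m for _ in range(n + 1)]
    best[1][0] = 0.0
    for t in range(2, n + 1):
        for j in range(t - 1, m):
            first = max(t - 2, j - max_skip - 1)  # at most max_skip skipped
            for i in range(first, j):
                cand = best[t - 1][i] + seg[i][j]
                if cand < best[t][j]:
                    best[t][j] = cand
                    back[t][j] = i

    # Close the final segment, which runs from the last selected frame to
    # the end of the sequence.
    best_total, last = math.inf, -1
    for j in range(n - 1, m):
        if m - 1 - j > max_skip:
            continue
        cand = best[n][j] + seg[j][m]
        if cand < best_total:
            best_total, last = cand, j
    if last < 0:
        raise ValueError("no summary satisfies the max_skip constraint")

    # Walk the back-pointers to recover the selected frame indices.
    picks = []
    j = last
    for t in range(n, 0, -1):
        picks.append(j)
        j = back[t][j]
    picks.reverse()
    return picks, best_total / m
```

As a usage illustration, a five-frame toy sequence with one scene change, `[(0.0,), (0.0,), (10.0,), (10.0,), (10.0,)]`, summarized with `n=2` selects frames 0 and 2 with zero average distortion, at a summarization rate of m/n = 5/2 in the abstract's notation. The sketch runs in O(n·m²) time after an O(m²) precomputation; it is meant to make the formulation concrete and is not a reproduction of the authors' algorithm.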