Naphade M R, Huang T S
IBM Thomas J. Watson Res. Center, Hawthorne, NY, USA.
IEEE Trans Neural Netw. 2002;13(4):793-810. doi: 10.1109/TNN.2002.1021881.
Multimedia understanding is a fast-emerging interdisciplinary research area. There is tremendous potential for effective use of multimedia content through intelligent analysis, and diverse application areas increasingly rely on multimedia understanding systems. Advances in multimedia understanding are directly related to advances in signal processing, computer vision, pattern recognition, multimedia databases, and smart sensors. We review state-of-the-art techniques in multimedia retrieval. In particular, we discuss how multimedia retrieval can be viewed as a pattern recognition problem and how the field increasingly relies on powerful pattern recognition and machine learning techniques. We review state-of-the-art multimedia understanding systems, with particular emphasis on a system for semantic video indexing centered around multijects and multinets. We discuss how semantic retrieval is centered around concepts and context, and the various mechanisms for modeling concepts and context.