AI-driven video summarization for optimizing content retrieval and management through deep learning techniques.

Author information

Vora Deepali, Kadam Payal, Mohite Dadaso D, Kumar Nilesh, Kumar Nimit, Radhakrishnan Pratheeik, Bhagwat Shalmali

Affiliations

Symbiosis Institute of Technology, Pune Campus, Symbiosis International (Deemed University), Pune, India.

Bharati Vidyapeeth (Deemed to be University) College of Engineering, Pune, 411043, India.

Publication information

Sci Rep. 2025 Feb 3;15(1):4058. doi: 10.1038/s41598-025-87824-9.

DOI:10.1038/s41598-025-87824-9

PMID:39901035

Original article: https://pmc.ncbi.nlm.nih.gov/articles/PMC11791181/

Abstract

With the rapid advancement of artificial intelligence, stakeholders increasingly ask how such technologies can enhance the environmental, social, and governance outcomes of organizations. This study addresses challenges in organizing and retrieving video content within large, heterogeneous media archives. Existing methods, often reliant on human intervention or low-complexity algorithms, struggle with the growing demands of online video quantity and quality. To address these limitations, the study proposes a novel approach that uses convolutional neural networks and long short-term memory networks to extract both frame-level and temporal video features. A 50-layer residual network (ResNet50) is integrated for enhanced content representation, and two-frame video flow is employed to improve system performance. The framework achieves precision, recall, and F-score of 79.2%, 86.5%, and 83%, respectively, on the YouTube, EPFL, and TVSum datasets. Beyond the technological advances, the study highlights opportunities for effective content management and emphasizes the promotion of sustainable digital practices. By minimizing data duplication and optimizing resource usage, the proposed system supports scalable solutions for large media collections.
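
The three reported metrics can be cross-checked against one another. The abstract does not say which F-score variant is used, so the sketch below assumes the balanced F1 (the harmonic mean of precision and recall). The function name `f_score` and its `beta` parameter are illustrative and do not come from the paper.

```python
def f_score(precision: float, recall: float, beta: float = 1.0) -> float:
    """F-beta score; beta=1 (assumed here) is the harmonic mean of precision and recall."""
    if precision <= 0.0 and recall <= 0.0:
        return 0.0  # avoid division by zero when both metrics are zero
    b2 = beta * beta
    return (1 + b2) * precision * recall / (b2 * precision + recall)


# Values reported in the abstract, expressed as fractions.
reported_precision = 0.792
reported_recall = 0.865

f1 = f_score(reported_precision, reported_recall)
print(f"F1 = {f1:.3f}")  # prints "F1 = 0.827"
```

Under that assumption the result rounds to 83%, which matches the F-score stated in the abstract.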
