AI Grand ICT Research Center, Dong-eui University, Busan 47340, Republic of Korea.
Department of Computer Software Engineering, Dong-eui University, Busan 47340, Republic of Korea.
Sensors (Basel). 2022 Dec 9;22(24):9656. doi: 10.3390/s22249656.
In this paper, we propose an intra-picture prediction method for depth video based on block clustering through a neural network. The proposed method addresses the problem that a block containing two or more clusters degrades the intra-prediction performance for depth video. The proposed neural network consists of a spatial feature prediction network and a clustering network. The spatial feature prediction network exploits spatial features in the vertical and horizontal directions and contains a 1D CNN layer and a fully connected layer. The 1D CNN layer extracts the vertical and horizontal spatial features from the top and left blocks of reference pixels, respectively. Although 1D CNNs are designed for time-series data, they can also capture spatial features by treating the pixel order along a given direction as a timestamp. The fully connected layer predicts the spatial features of the block to be coded from the extracted features. The clustering network finds clusters from the spatial features output by the spatial feature prediction network. It consists of four CNN layers: the first three combine the vertical and horizontal spatial features, and the last outputs the probabilities that each pixel belongs to each cluster. The pixels of the block are predicted by the representative values of the clusters, where each representative value is the average of the reference pixels belonging to that cluster. To support intra prediction for various block sizes, the block is scaled to the network input size, and the prediction result is scaled back to the original size. In network training, the mean squared error between the original block and the predicted block is used as the loss function, and a penalty on output values far from both ends is added to the loss to encourage clear network clustering.
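The cluster-based prediction and the training loss described above can be sketched as follows. This is a minimal illustration, not the paper's implementation: the two-cluster setting, the soft-assignment shapes, and the `p*(1-p)` form of the "far from both ends" penalty are all assumptions made for clarity.

```python
import numpy as np

def representative_values(ref_pixels, ref_prob):
    # Representative value of each cluster: the (probability-weighted)
    # average of the reference pixels belonging to that cluster.
    # ref_pixels : (R,)   reference pixels (top row + left column)
    # ref_prob   : (R, K) per-reference-pixel cluster probabilities
    w = ref_prob / (ref_prob.sum(axis=0, keepdims=True) + 1e-8)
    return w.T @ ref_pixels                       # shape (K,)

def predict_block(ref_pixels, ref_prob, block_prob):
    # Each block pixel is predicted as the probability-weighted mix of
    # the cluster representative values.
    # block_prob : (N, N, K) clustering-network output probabilities
    reps = representative_values(ref_pixels, ref_prob)
    return block_prob @ reps                      # shape (N, N)

def training_loss(orig, pred, block_prob, lam=0.1):
    # MSE between original and predicted block, plus a penalty that
    # grows when probabilities sit far from both ends (0 and 1),
    # pushing the network toward clear-cut clustering.
    mse = np.mean((orig - pred) ** 2)
    penalty = lam * np.mean(block_prob * (1.0 - block_prob))
    return mse + penalty
```

With hard (0/1) cluster probabilities the penalty vanishes, so the loss reduces to the plain MSE; soft, ambiguous assignments are penalized, which matches the stated goal of clear clustering.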
Simulation results show that, under the same distortion, the bit rate is reduced by up to 12.45% compared with the latest video coding standard.