Masked Feature Residual Coding for Neural Video Compression

Authors

Shin Chajin, Kim Yonghwan, Choi KwangPyo, Lee Sangyoun

Affiliations

School of Electrical and Electronic Engineering, Yonsei University, Seoul 03722, Republic of Korea.

Samsung Seoul R&D Campus, Seoul 06765, Republic of Korea.

Publication information

Sensors (Basel). 2025 Jul 17;25(14):4460. doi: 10.3390/s25144460.

DOI: 10.3390/s25144460
PMID: 40732586
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC12299318/
Abstract

In neural video compression, an approximation of the target frame is predicted, and a mask is subsequently applied to it. Then, the masked predicted frame is subtracted from the target frame and fed into the encoder along with the conditional information. However, this structure has two limitations. First, in the pixel domain, even if the mask is perfectly predicted, the residuals cannot be significantly reduced. Second, reconstructed features with abundant temporal context information cannot be used as references for compressing the next frame. To address these problems, we propose Conditional Masked Feature Residual (CMFR) Coding. We extract features from the target frame and the predicted features using neural networks. Then, we predict the mask and subtract the masked predicted features from the target features. Thereafter, the difference is fed into the encoder with the conditional information. Moreover, to more effectively remove conditional information from the target frame, we introduce a Scaled Feature Fusion (SFF) module. In addition, we introduce a Motion Refiner to enhance the quality of the decoded optical flow. Experimental results show that our model achieves an 11.76% bit saving over the model without the proposed methods, averaged over all HEVC test sequences, demonstrating the effectiveness of the proposed methods.
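The masked residual step described in the abstract can be sketched numerically. This is a minimal illustration, not the authors' implementation: the feature maps are random NumPy arrays standing in for network outputs, the mask is set by hand rather than predicted, and the function names `masked_residual` and `reconstruct` are hypothetical. Quantization and entropy coding, which make a real codec lossy, are omitted.

```python
import numpy as np


def masked_residual(target_feat, pred_feat, mask):
    """Subtract the masked prediction from the target (element-wise)."""
    return target_feat - mask * pred_feat


def reconstruct(residual, pred_feat, mask):
    """Invert masked_residual: add the masked prediction back."""
    return residual + mask * pred_feat


rng = np.random.default_rng(0)

# Hypothetical C x H x W feature maps; in the paper these come from
# learned feature extractors, here they are random stand-ins.
target_feat = rng.normal(size=(4, 8, 8))
pred_feat = target_feat + 0.1 * rng.normal(size=(4, 8, 8))

# Hand-set mask (assumption): trust the prediction in the left half only.
mask = np.ones_like(pred_feat)
mask[:, :, 4:] = 0.0

residual = masked_residual(target_feat, pred_feat, mask)
recovered = reconstruct(residual, pred_feat, mask)

# Where the mask is 1 the residual is small; where it is 0 the
# residual is simply the target features.
print("mean |residual|, masked-in region :", np.abs(residual[:, :, :4]).mean())
print("mean |residual|, masked-out region:", np.abs(residual[:, :, 4:]).mean())
```

In this toy setup the residual in the region where the mask keeps the prediction is much smaller than in the region where the prediction is masked out, which is the quantity the encoder would then have to code.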


https://cdn.ncbi.nlm.nih.gov/pmc/blobs/1418/12299318/0b679b585c8c/sensors-25-04460-g001.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/1418/12299318/59fcff7adc7c/sensors-25-04460-g002.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/1418/12299318/557317e9f12d/sensors-25-04460-g003.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/1418/12299318/17b2770d4819/sensors-25-04460-g004.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/1418/12299318/2b6f8bf6dd56/sensors-25-04460-g005.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/1418/12299318/2f4a3bd12c3c/sensors-25-04460-g006.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/1418/12299318/e278358701f1/sensors-25-04460-g007.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/1418/12299318/0c524c9090b7/sensors-25-04460-g008.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/1418/12299318/3ac2fa20f926/sensors-25-04460-g009.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/1418/12299318/d9c949152741/sensors-25-04460-g010.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/1418/12299318/adf1b8d44922/sensors-25-04460-g011.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/1418/12299318/2d2e1faae7c2/sensors-25-04460-g012.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/1418/12299318/30efe937a2a0/sensors-25-04460-g013.jpg
