文献检索，用中文搜 PubMed

应用&插件

Zotero 插件浏览器插件 Mac 客户端 Windows 客户端微信小程序

定价

高级版会员购买积分包购买API积分包

服务

文献检索文档翻译深度研究 API 文档 MCP 服务

关于我们

关于 Suppr 公司介绍联系我们用户协议隐私条款

关注我们

Suppr 超能文献

核心技术专利：CN118964589B侵权必究

粤ICP备2023148730 号-1Suppr @ 2026

Vision-Language Pre-training (VLP) has shown promising performance in various tasks by learning a generic image-text representation space. However, most existing VLP methods encounter the Noisy Correspondence (NC) problem which refers to wrongly matched image-text pairs harvested from the wild. In this paper, we empirically study the influence of NC on the VLP model and obtain the following two observations. First, the NC will largely degrade the performance in downstream tasks even via fine-tuning, indicating the necessity of handling NC in the pre-training period. Second, the influence of NC varies in different pre-training objectives, suggesting the objective-customized solution for achieving NC robustness. Based on the above observations, we propose a novel NoisE-robust Vision-languagE pRe-training method (NEVER) to endow the VLP model with robustness against NC. In brief, NEVER first divides the training data into clean and noisy subsets in a progressive and adaptive manner. Then NEVER employs the positive learning (PL) and negative learning (NL) on the splits to enjoy model convergence and noise robustness, respectively. To further handle the false negative in PL and NL, NEVER proposes to smoothen and sharpen the training targets with the predictions from a twin momentum model. Extensive experiments on the various V+L tasks verify the effectiveness of the proposed method.

Suppr 超能文献

文献检索

文件翻译

深度研究

Suppr 超能文献

文献检索

文件翻译

深度研究

Noise-Robust Vision-Language Pre-Training With Positive-Negative Learning.

作者信息

出版信息

相似文献

Noise-Robust Vision-Language Pre-Training With Positive-Negative Learning.

作者信息

出版信息

相似文献