TapWeight: Reweighting Pretraining Objectives for Task-Adaptive Pretraining.

Authors

Zhang Ruiyi, Somayajula Sai Ashish, Xie Pengtao

Affiliation

UC San Diego.

Publication

Transact Mach Learn Res. 2025 Jun;2025.

PMID: 40855868
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC12377235/

Abstract

Large-scale general domain pretraining followed by downstream-specific finetuning has become a predominant paradigm in machine learning. However, discrepancies between the pretraining and target domains can still lead to performance degradation in certain cases, underscoring the need for task-adaptive continued pretraining (TAP). TAP methods typically involve continued pretraining on task-specific unlabeled datasets or introducing additional unsupervised learning objectives to enhance model capabilities. While many TAP methods perform continued pretraining with multiple pretraining objectives, they often determine the tradeoff parameters between objectives manually, resulting in suboptimal outcomes and higher computational costs. In this paper, we propose TapWeight, a task-adaptive pretraining framework which automatically determines the optimal importance of each pretraining objective based on downstream feedback. TapWeight reweights each pretraining objective by solving a multi-level optimization problem. We applied TapWeight to both molecular property prediction and natural language processing tasks, significantly surpassing baseline methods. Experimental results validate the effectiveness and generalizability of TapWeight. Our code is available at https://github.com/ruz048/TapWeight.
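
The idea of learning objective weights from downstream feedback can be illustrated with a deliberately small toy. The sketch below is not the authors' algorithm or code: it collapses the multi-level problem described in the abstract into two levels, uses a single scalar parameter, and all names (`TARGETS`, `DOWNSTREAM_TARGET`, `pretrain`, `learn_weights`) are hypothetical. Each "pretraining objective" is assumed to be a quadratic pulling the parameter toward its own target, the lower level minimizes the weighted sum of objectives in closed form, and the upper level adjusts the weights by finite-difference gradient descent on a downstream validation loss.

```python
import math

# Hypothetical toy setup (an assumption, not taken from the paper):
# pretraining objective i is (theta - TARGETS[i]) ** 2.
TARGETS = [0.0, 4.0]
# Assumed downstream validation loss: (theta - DOWNSTREAM_TARGET) ** 2.
DOWNSTREAM_TARGET = 3.0


def softmax(logits):
    """Map unconstrained logits to positive weights that sum to 1."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def pretrain(weights):
    """Lower level: argmin over theta of sum_i w_i * (theta - t_i)^2.

    For quadratics the minimiser is the weighted mean of the targets.
    """
    return sum(w * t for w, t in zip(weights, TARGETS)) / sum(weights)


def downstream_loss(theta):
    """Feedback signal from the (toy) downstream task."""
    return (theta - DOWNSTREAM_TARGET) ** 2


def outer_objective(logits):
    """Downstream loss of the model pretrained with the given weights."""
    return downstream_loss(pretrain(softmax(logits)))


def learn_weights(steps=500, lr=0.2, eps=1e-5):
    """Upper level: gradient descent on the logits, using central
    finite differences instead of the hypergradients a real system needs."""
    logits = [0.0] * len(TARGETS)
    for _ in range(steps):
        grads = []
        for i in range(len(logits)):
            up = list(logits)
            down = list(logits)
            up[i] += eps
            down[i] -= eps
            grads.append((outer_objective(up) - outer_objective(down)) / (2 * eps))
        logits = [z - lr * g for z, g in zip(logits, grads)]
    return softmax(logits)


weights = learn_weights()
theta = pretrain(weights)
print("learned weights:", [round(w, 3) for w in weights])
print("pretrained theta:", round(theta, 3))
```

In this toy the second objective ends up with the larger weight because its target lies closer to the downstream target, which is the kind of behavior manual tradeoff tuning would otherwise have to find by search. A real implementation would replace the closed-form lower level with actual continued pretraining and finetuning steps, and the finite differences with a proper multi-level gradient method.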
