Algorithm-Dependent Generalization Bounds for Multi-Task Learning.

Publication Information

IEEE Trans Pattern Anal Mach Intell. 2017 Feb;39(2):227-241. doi: 10.1109/TPAMI.2016.2544314. Epub 2016 Mar 21.

Abstract

Often, tasks are collected for multi-task learning (MTL) because they share similar feature structures. Based on this observation, in this paper, we present novel algorithm-dependent generalization bounds for MTL by exploiting the notion of algorithmic stability. We focus on the performance of one particular task and the average performance over multiple tasks by analyzing the generalization ability of a common parameter that is shared in MTL. When focusing on one particular task, with the help of a mild assumption on the feature structures, we interpret the function of the other tasks as a regularizer that produces a specific inductive bias. The algorithm for learning the common parameter, as well as the predictor, is thereby uniformly stable with respect to the domain of the particular task and has a generalization bound with a fast convergence rate of order O(1/n), where n is the sample size of the particular task. When focusing on the average performance over multiple tasks, we prove that a similar inductive bias exists under certain conditions on the feature structures. Thus, the corresponding algorithm for learning the common parameter is also uniformly stable with respect to the domains of the multiple tasks, and its generalization bound is of the order O(1/T), where T is the number of tasks. These theoretical analyses naturally show that the similarity of feature structures in MTL will lead to specific regularizations for predicting, which enables the learning algorithms to generalize fast and correctly from a few examples.
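The fast rates claimed above rest on the notion of uniform stability. As standard background (a sketch of the classical Bousquet–Elisseeff framework, not the paper's exact theorem statements), the definition and the resulting bound can be written as:

```latex
% Uniform stability (standard definition): an algorithm A is
% \beta-uniformly stable if removing any single training example from
% the sample S changes the loss at any test point z by at most \beta.
\[
\sup_{z}\; \bigl|\,\ell(A_S, z) - \ell(A_{S^{\setminus i}}, z)\,\bigr| \;\le\; \beta
\qquad \text{for all } S \text{ and all } i \in \{1,\dots,n\}.
\]
% For a loss bounded by M, \beta-uniform stability yields, with
% probability at least 1-\delta over a sample S of size n,
\[
R(A_S) \;\le\; \widehat{R}(A_S) \;+\; 2\beta
\;+\; \bigl(4n\beta + M\bigr)\sqrt{\frac{\ln(1/\delta)}{2n}},
\]
% where R is the risk and \widehat{R} the empirical risk. In
% expectation the gap is bounded directly by \beta, so when the shared
% parameter makes \beta = O(1/n) (or O(1/T) across tasks), the
% generalization error decays at the fast rates stated in the abstract.
```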
