GE Global Research, 2623 Camino Ramon, San Ramon, CA 94583, USA.
IEEE Trans Pattern Anal Mach Intell. 2013 May;35(5):1025-38. doi: 10.1109/TPAMI.2012.189.
In this paper, we consider the problem of learning from multiple related tasks for improved generalization performance by extracting their shared structures. The alternating structure optimization (ASO) algorithm, which couples all tasks through a shared feature representation, has been successfully applied to various multitask learning problems. However, ASO is nonconvex and the alternating algorithm only finds a local solution. We first present an improved ASO formulation (iASO) for multitask learning based on a new regularizer. We then convert iASO, a nonconvex formulation, into a relaxed convex one (rASO). Interestingly, our theoretical analysis reveals that rASO finds a globally optimal solution to its nonconvex counterpart iASO under certain conditions. rASO can be equivalently reformulated as a semidefinite program (SDP), which is, however, not scalable to large datasets. We propose to employ the block coordinate descent (BCD) method and the accelerated projected gradient (APG) algorithm separately to find the globally optimal solution to rASO; we also develop efficient algorithms for solving the key subproblems involved in BCD and APG. Experiments on the Yahoo web page datasets and the Drosophila gene expression pattern image datasets demonstrate the effectiveness and efficiency of the proposed algorithms and confirm our theoretical analysis.
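For context, a minimal sketch of the two formulations in the notation common to the ASO literature (illustrative, not the paper's exact statement; n_t denotes the sample size of task t, L a convex loss, h the dimensionality of the shared structure, and \Theta an h-by-d matrix with orthonormal rows). iASO couples the m tasks through \Theta:

\min_{\{u_t, v_t\},\ \Theta\Theta^\top = I} \ \sum_{t=1}^{m} \left[ \frac{1}{n_t} \sum_{i=1}^{n_t} L\!\left(u_t^\top x_i^t,\, y_i^t\right) + \alpha \left\| u_t - \Theta^\top v_t \right\|^2 + \beta \left\| u_t \right\|^2 \right].

Minimizing over v_t in closed form (v_t = \Theta u_t) and relaxing the nonconvex feasible set \{\Theta^\top\Theta : \Theta\Theta^\top = I_h\} to its convex hull \{M : \mathrm{tr}(M) = h,\ 0 \preceq M \preceq I\} yields the relaxed convex problem (rASO)

\min_{U,\, M} \ \sum_{t=1}^{m} \frac{1}{n_t} \sum_{i=1}^{n_t} L\!\left(u_t^\top x_i^t,\, y_i^t\right) + \alpha\eta(1+\eta)\,\mathrm{tr}\!\left( U^\top (\eta I + M)^{-1} U \right), \qquad \eta = \beta/\alpha,

which is jointly convex in the weight matrix U = [u_1, \ldots, u_m] and M.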
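The BCD route mentioned above alternates between the task weights U and the relaxed structure matrix M, each of which admits an efficient subproblem solver. Below is a minimal, self-contained Python sketch under simplifying assumptions: squared loss, dense linear algebra, an M-step reduced to an eigenvalue waterfilling solved by bisection, and hypothetical function names (solve_M, bcd_raso). It illustrates the alternating scheme, not the paper's exact algorithms.

import numpy as np

def solve_M(U, eta, h):
    # M-step: argmin_M tr(U^T (eta*I + M)^{-1} U) over {M : tr(M) = h, 0 <= M <= I}.
    # The optimal M shares eigenvectors with U U^T, so the problem reduces to
    # choosing eigenvalues gamma in [0, 1] with sum(gamma) = h (a waterfilling
    # problem); the Lagrange multiplier is found by geometric bisection.
    d = U.shape[0]
    P, sigma, _ = np.linalg.svd(U, full_matrices=True)
    s = np.zeros(d)
    s[: len(sigma)] = sigma

    def gamma(lam):  # per-coordinate minimizer of s_i^2/(eta+g) + lam*g on [0, 1]
        return np.clip(s / np.sqrt(lam) - eta, 0.0, 1.0)

    lo, hi = 1e-12, 1e12  # assumes h <= number of nonzero singular values
    for _ in range(200):
        mid = np.sqrt(lo * hi)
        if gamma(mid).sum() > h:
            lo = mid
        else:
            hi = mid
    g = gamma(np.sqrt(lo * hi))
    return (P * g) @ P.T  # P diag(g) P^T

def bcd_raso(Xs, ys, alpha, beta, h, iters=50):
    # Block coordinate descent for the relaxed convex problem with squared loss.
    # Xs[t] is the (n_t, d) design matrix of task t; ys[t] its (n_t,) targets.
    d, m = Xs[0].shape[1], len(Xs)
    eta = beta / alpha
    c = alpha * eta * (1.0 + eta)      # coefficient of the coupled regularizer
    M = np.eye(d) * (h / d)            # feasible start: tr(M) = h, 0 <= M <= I
    U = np.zeros((d, m))
    for _ in range(iters):
        A = np.linalg.inv(eta * np.eye(d) + M)
        for t in range(m):
            X, y = Xs[t], ys[t]
            n = len(y)
            # u_t solves (X^T X / n + c*A) u = X^T y / n, the normal equations
            # of the ridge-style per-task subproblem; tasks decouple given M.
            U[:, t] = np.linalg.solve(X.T @ X / n + c * A, X.T @ y / n)
        M = solve_M(U, eta, h)         # analytic structure update
    return U, M

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    d, m = 20, 5
    basis = rng.normal(size=(d, 2))    # ground-truth 2-dim shared structure
    Xs = [rng.normal(size=(50, d)) for _ in range(m)]
    ys = [X @ (basis @ rng.normal(size=2)) + 0.01 * rng.normal(size=50)
          for X in Xs]
    U, M = bcd_raso(Xs, ys, alpha=1.0, beta=0.1, h=2)
    print("tr(M) =", round(float(np.trace(M)), 2))  # ~= h

The sketch shows why each BCD block is cheap: the U-step is m independent ridge-style linear systems, and the M-step is a single SVD plus a one-dimensional search. An APG variant would instead take projected gradient steps on (U, M) jointly over the same convex feasible set.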