Domain adaptation via transfer component analysis.

Authors

Pan Sinno Jialin, Tsang Ivor W, Kwok James T, Yang Qiang

Affiliation

Institute of Infocomm Research, 138632, Singapore.

Publication information

IEEE Trans Neural Netw. 2011 Feb;22(2):199-210. doi: 10.1109/TNN.2010.2091281. Epub 2010 Nov 18.

Abstract

Domain adaptation allows knowledge from a source domain to be transferred to a different but related target domain. Intuitively, discovering a good feature representation across domains is crucial. In this paper, we first propose to find such a representation through a new learning method, transfer component analysis (TCA), for domain adaptation. TCA tries to learn some transfer components across domains in a reproducing kernel Hilbert space using maximum mean discrepancy. In the subspace spanned by these transfer components, data properties are preserved and data distributions in different domains are close to each other. As a result, with the new representations in this subspace, we can apply standard machine learning methods to train classifiers or regression models in the source domain for use in the target domain. Furthermore, in order to uncover the knowledge hidden in the relations between the data labels from the source and target domains, we extend TCA in a semisupervised learning setting, which encodes label information into transfer components learning. We call this extension semisupervised TCA. The main contribution of our work is that we propose a novel dimensionality reduction framework for reducing the distance between domains in a latent space for domain adaptation. We propose both unsupervised and semisupervised feature extraction approaches, which can dramatically reduce the distance between domain distributions by projecting data onto the learned transfer components. Finally, our approach can handle large datasets and naturally leads to out-of-sample generalization. The effectiveness and efficiency of our approach are verified by experiments on five toy datasets and two real-world applications: cross-domain indoor WiFi localization and cross-domain text classification.
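To make the idea of learning transfer components with maximum mean discrepancy (MMD) more concrete, the following is a minimal NumPy sketch of the unsupervised variant. It is an illustration under stated assumptions, not the authors' implementation. The abstract does not fix the kernel, the trade-off parameter, or the eigen-solver, so the Gaussian kernel, the names `gamma`, `mu`, and `n_components`, and the helper functions `rbf_kernel`, `mmd_matrix`, `mmd2`, `tca_fit`, and `tca_transform` are all assumptions made here. The sketch takes the transfer components to be the leading eigenvectors of (KLK + mu I)^{-1} KHK, where K is the kernel matrix over the pooled source and target samples, L encodes the empirical MMD, and H is the centering matrix.

```python
import numpy as np


def rbf_kernel(A, B, gamma=1.0):
    """Gaussian (RBF) kernel matrix between the rows of A and the rows of B."""
    sq = (A ** 2).sum(1)[:, None] + (B ** 2).sum(1)[None, :] - 2.0 * A @ B.T
    return np.exp(-gamma * np.maximum(sq, 0.0))


def mmd_matrix(n_s, n_t):
    """Coefficient matrix L such that trace(K @ L) is the squared empirical MMD."""
    e = np.concatenate([np.full(n_s, 1.0 / n_s), np.full(n_t, -1.0 / n_t)])
    return np.outer(e, e)


def mmd2(Xs, Xt, gamma=1.0):
    """Biased estimate of the squared MMD between two samples (assumed RBF kernel)."""
    X = np.vstack([Xs, Xt])
    K = rbf_kernel(X, X, gamma)
    return float(np.trace(K @ mmd_matrix(len(Xs), len(Xt))))


def tca_fit(Xs, Xt, n_components=2, mu=1.0, gamma=1.0):
    """Sketch of unsupervised TCA: returns the pooled training data and W."""
    X = np.vstack([Xs, Xt])
    n = len(X)
    K = rbf_kernel(X, X, gamma)
    L = mmd_matrix(len(Xs), len(Xt))
    H = np.eye(n) - np.ones((n, n)) / n  # centering matrix
    # Leading eigenvectors of (K L K + mu I)^{-1} K H K.
    M = np.linalg.solve(K @ L @ K + mu * np.eye(n), K @ H @ K)
    vals, vecs = np.linalg.eig(M)
    # The eigenvalues are real in exact arithmetic; drop numerical imaginary noise.
    order = np.argsort(-vals.real)[:n_components]
    W = vecs[:, order].real
    return X, W


def tca_transform(X_train, W, X_new, gamma=1.0):
    """Embed new points via their kernel values against the training data."""
    return rbf_kernel(X_new, X_train, gamma) @ W


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    Xs = rng.normal(size=(30, 3))        # hypothetical source sample
    Xt = rng.normal(size=(40, 3)) + 3.0  # hypothetical shifted target sample
    X_train, W = tca_fit(Xs, Xt, n_components=2)
    Zs = tca_transform(X_train, W, Xs)   # source data in the learned subspace
    Zt = tca_transform(X_train, W, Xt)   # target data in the learned subspace
    print(Zs.shape, Zt.shape, mmd2(Xs, Xt))
```

In this sketch, a standard classifier or regressor would be trained on `Zs` with the source labels and applied to `Zt`. Because `tca_transform` needs only kernel evaluations between new points and the training data, unseen samples can be embedded without refitting, which is one way to read the out-of-sample property mentioned in the abstract.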
