IEEE Trans Neural Netw Learn Syst. 2017 Jul;28(7):1682-1695. doi: 10.1109/TNNLS.2016.2538282. Epub 2016 Apr 20.
Many classification methods perform well when training and testing data are drawn from the same distribution. In real applications, however, this condition may be violated, which degrades classification accuracy. Domain adaptation is an effective approach to this problem. In this paper, we propose a general domain adaptation framework from the perspective of prediction reweighting, from which a novel approach is derived. Unlike most existing domain adaptation methods, our idea is to reweight the predictions of the training classifier on testing data according to their signed distance to the domain separator, which is a classifier that distinguishes training data (from the source domain) from testing data (from the target domain). We then propagate the labels of target instances with larger weights to those with smaller weights by introducing a manifold regularization method. It can be proved that our reweighting scheme effectively brings the source and target domains closer to each other in an appropriate sense, so that classification in the target domain becomes easier. The proposed method can be implemented efficiently by a simple two-stage algorithm, and the target classifier has a closed-form solution. The effectiveness of our approach is verified by experiments on artificial datasets and on two standard benchmarks: a visual object recognition task and a cross-domain sentiment analysis task on text. Experimental results demonstrate that our method is competitive with state-of-the-art domain adaptation algorithms.
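The abstract gives no formulas, so the following is only a minimal sketch of how a two-stage prediction-reweighting scheme of this kind might look, not the paper's actual algorithm. Everything specific in it is an assumption made for illustration: the ridge (regularized least-squares) linear models used for both the source classifier and the domain separator, the logistic mapping from signed distance to weight, the RBF affinity graph, and the hypothetical names `fit_ridge`, `prediction_reweighting`, `gamma`, and `sigma`. Stage one computes weighted predictions on the target data; stage two solves a weighted, Laplacian-regularized least-squares problem, which has the closed-form solution f = (W + gamma L)^{-1} W y.

```python
import numpy as np


def fit_ridge(X, y, lam=1e-2):
    """Regularized least-squares linear model (assumed stand-in classifier).

    Returns the weight vector w and the bias b of the decision function
    x -> w @ x + b. The bias term is not penalized.
    """
    Xb = np.hstack([X, np.ones((X.shape[0], 1))])
    reg = lam * np.eye(Xb.shape[1])
    reg[-1, -1] = 0.0
    theta = np.linalg.solve(Xb.T @ Xb + reg, Xb.T @ y)
    return theta[:-1], theta[-1]


def prediction_reweighting(Xs, ys, Xt, gamma=1.0, sigma=1.0):
    """Illustrative two-stage sketch; labels in ys are expected to be -1/+1.

    Returns the propagated target scores f and the per-instance weights.
    """
    # Stage 1a: train the source classifier and predict on the target data.
    w_c, b_c = fit_ridge(Xs, ys)
    pred = np.sign(Xt @ w_c + b_c)

    # Stage 1b: train the domain separator (source = +1, target = -1) and
    # compute the signed distance of each target instance to its hyperplane.
    X_all = np.vstack([Xs, Xt])
    d_all = np.concatenate([np.ones(len(Xs)), -np.ones(len(Xt))])
    w_d, b_d = fit_ridge(X_all, d_all)
    dist = (Xt @ w_d + b_d) / (np.linalg.norm(w_d) + 1e-12)

    # Assumption: a logistic map turns signed distance into a weight in (0, 1),
    # so target instances that look more source-like get larger weights.
    weights = 1.0 / (1.0 + np.exp(-dist))

    # Stage 2: manifold regularization on the target data. Build an RBF
    # affinity graph and its unnormalized Laplacian L = D - A.
    sq = ((Xt[:, None, :] - Xt[None, :, :]) ** 2).sum(axis=-1)
    A = np.exp(-sq / (2.0 * sigma ** 2))
    np.fill_diagonal(A, 0.0)
    L = np.diag(A.sum(axis=1)) - A

    # Closed form of  min_f  sum_i w_i (f_i - pred_i)^2 + gamma * f^T L f.
    f = np.linalg.solve(np.diag(weights) + gamma * L, weights * pred)
    return f, weights
```

Calling `prediction_reweighting(Xs, ys, Xt)` with NumPy arrays of source features, source labels in {-1, +1}, and target features returns real-valued target scores whose signs serve as predicted labels. Because the weights are strictly positive and the Laplacian is positive semidefinite, the linear system in stage two always has a unique solution; low-weight instances are fitted loosely to their own prediction and therefore inherit labels from confident neighbours on the graph.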