Ebrahimi Mohammadreza, Chai Yidong, Zhang Hao Helen, Chen Hsinchun
IEEE Trans Pattern Anal Mach Intell. 2023 Feb;45(2):1862-1875. doi: 10.1109/TPAMI.2022.3163338. Epub 2023 Jan 6.
Learning predictive models in new domains with scarce training data is a growing challenge in modern supervised learning scenarios. This incentivizes developing domain adaptation methods that leverage the knowledge in known domains (source) and adapt to new domains (target) with a different probability distribution. This becomes more challenging when the source and target domains are in heterogeneous feature spaces, known as heterogeneous domain adaptation (HDA). While most HDA methods utilize mathematical optimization to map source and target data to a common space, they suffer from low transferability. Neural representations have proven to be more transferable; however, they are mainly designed for homogeneous environments. Drawing on the theory of domain adaptation, we propose a novel framework, Heterogeneous Adversarial Neural Domain Adaptation (HANDA), to effectively maximize the transferability in heterogeneous environments. HANDA conducts feature and distribution alignment in a unified neural network architecture and achieves domain invariance through adversarial kernel learning. Three experiments were conducted to evaluate the performance against the state-of-the-art HDA methods on major image and text e-commerce benchmarks. HANDA shows statistically significant improvement in predictive performance. The practical utility of HANDA was shown in real-world dark web online markets. HANDA is an important step towards successful domain adaptation in e-commerce applications.
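The abstract describes two ingredients: mapping heterogeneous source and target features into a common space, and aligning the resulting distributions. The sketch below illustrates only that general idea and is not HANDA itself. It assumes fixed, hand-picked linear projections (hypothetical `W_src`, `W_tgt_good`, `W_tgt_bad`) in place of learned neural networks, and it measures distribution mismatch with a plain maximum mean discrepancy (MMD) under an RBF kernel in place of adversarial kernel learning. All data are synthetic.

```python
import math
import random


def project(rows, weights):
    """Linearly map each feature vector into the shared space: z = W x."""
    return [[sum(w * x for w, x in zip(w_row, row)) for w_row in weights]
            for row in rows]


def rbf(a, b, gamma=1.0):
    """RBF kernel k(a, b) = exp(-gamma * ||a - b||^2)."""
    sq_dist = sum((ai - bi) ** 2 for ai, bi in zip(a, b))
    return math.exp(-gamma * sq_dist)


def mmd2(xs, ys, gamma=1.0):
    """Biased estimate of the squared MMD between two samples."""
    def mean_kernel(us, vs):
        total = sum(rbf(u, v, gamma) for u in us for v in vs)
        return total / (len(us) * len(vs))
    return mean_kernel(xs, xs) + mean_kernel(ys, ys) - 2.0 * mean_kernel(xs, ys)


rng = random.Random(0)

# Synthetic heterogeneous domains: 3-D source features, 2-D target features
# on a different scale (standard deviation 3 instead of 1).
source = [[rng.gauss(0, 1) for _ in range(3)] for _ in range(100)]
target = [[rng.gauss(0, 3) for _ in range(2)] for _ in range(100)]

# Hypothetical projections into a shared 2-D space (assumed, not learned).
W_src = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]   # drop the third source feature
W_tgt_bad = [[1.0, 0.0], [0.0, 1.0]]         # leaves the scale mismatch in place
W_tgt_good = [[1 / 3, 0.0], [0.0, 1 / 3]]    # rescales target to match source

z_src = project(source, W_src)
mmd_bad = mmd2(z_src, project(target, W_tgt_bad))
mmd_good = mmd2(z_src, project(target, W_tgt_good))

print(f"squared MMD with mismatched projection: {mmd_bad:.4f}")
print(f"squared MMD with rescaled projection:   {mmd_good:.4f}")
```

A lower discrepancy after projection means the two domains look more alike in the shared space. A domain adaptation method would learn the projections to reduce such a discrepancy while keeping the representation predictive of the labels; here the projections are fixed so that the effect of alignment is easy to see.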