IEEE Trans Neural Netw Learn Syst. 2015 Dec;26(12):3034-44. doi: 10.1109/TNNLS.2015.2401595. Epub 2015 Feb 26.
In practical machine learning applications, human instruction is indispensable for model construction. To use this costly labeling effort effectively, active learning queries the user interactively through selective sampling. Traditional active learning techniques focus only on the unlabeled data set under a unidirectional exploration framework, and they suffer from model deterioration in the presence of noise. To address this problem, this paper proposes a novel bidirectional active learning algorithm that explores both the unlabeled and labeled data sets simultaneously in a two-way process. To acquire new knowledge, forward learning queries the most informative instances from the unlabeled data set. To introspect on learned knowledge, backward learning detects the most suspiciously unreliable instances within the labeled data set. Under this two-way exploration framework, the generalization ability of the learning model can be greatly improved, as demonstrated by encouraging experimental results.
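The two-way process described in the abstract can be illustrated with a minimal sketch. The abstract does not specify the selection criteria, so everything below is an assumption made for illustration: a nearest-centroid classifier with softmax probabilities stands in for the learning model, predictive entropy stands in for "most informative", and the leave-one-out probability of an instance's own label stands in for "most suspiciously unreliable". The function names (`forward_query`, `backward_query`) and the toy data are hypothetical and are not taken from the paper.

```python
import math
from collections import defaultdict


def centroids(X, y):
    """Mean feature vector per class (stand-in model, not the paper's)."""
    sums, counts = {}, defaultdict(int)
    for x, label in zip(X, y):
        if label not in sums:
            sums[label] = [0.0] * len(x)
        sums[label] = [s + v for s, v in zip(sums[label], x)]
        counts[label] += 1
    return {c: [s / counts[c] for s in sums[c]] for c in sums}


def predict_proba(cents, x):
    """Softmax over negative squared distances to each class centroid."""
    scores = {c: -sum((a - b) ** 2 for a, b in zip(x, m))
              for c, m in cents.items()}
    top = max(scores.values())  # subtract the max for numerical stability
    exps = {c: math.exp(s - top) for c, s in scores.items()}
    total = sum(exps.values())
    return {c: e / total for c, e in exps.items()}


def entropy(p):
    """Shannon entropy of a class-probability dictionary."""
    return -sum(v * math.log(v) for v in p.values() if v > 0)


def forward_query(X_lab, y_lab, X_unl):
    """Forward step (assumed criterion): index of the unlabeled
    instance with the highest predictive entropy."""
    cents = centroids(X_lab, y_lab)
    return max(range(len(X_unl)),
               key=lambda i: entropy(predict_proba(cents, X_unl[i])))


def backward_query(X_lab, y_lab):
    """Backward step (assumed criterion): index of the labeled instance
    whose own label is least probable under a model fit on the other
    labeled instances. Requires at least two labeled instances."""
    def support(i):
        X_rest = X_lab[:i] + X_lab[i + 1:]
        y_rest = y_lab[:i] + y_lab[i + 1:]
        p = predict_proba(centroids(X_rest, y_rest), X_lab[i])
        # A label absent from the remaining data gets zero support.
        return p.get(y_lab[i], 0.0)
    return min(range(len(X_lab)), key=support)


# Toy data: two clusters; the instance at index 2 sits in the class-0
# cluster but is deliberately given label 1 to simulate label noise.
X_lab = [[0.0, 0.0], [0.2, 0.1], [0.1, 0.2],
         [1.0, 1.0], [0.9, 1.1], [1.1, 0.9]]
y_lab = [0, 0, 1, 1, 1, 1]
X_unl = [[0.05, 0.1], [0.5, 0.55], [1.0, 0.95]]

print("forward query index:", forward_query(X_lab, y_lab, X_unl))
print("backward query index:", backward_query(X_lab, y_lab))
```

In a full loop, the instance chosen by the forward step would be sent to the user for a new label, and the instance chosen by the backward step would be sent back for relabeling or confirmation, after which the model would be refit. How the two steps are balanced within a labeling budget is not stated in the abstract and is left open here.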