Gajowniczek Krzysztof, Liang Yitao, Friedman Tal, Ząbkowski Tomasz, Van den Broeck Guy
Department of Artificial Intelligence, Institute of Information Technology, Warsaw University of Life Sciences-SGGW, 02-776 Warsaw, Poland.
Computer Science Department, University of California, Los Angeles, CA 90095, USA.
Entropy (Basel). 2020 Mar 14;22(3):334. doi: 10.3390/e22030334.
The increasing size of modern datasets, combined with the difficulty of obtaining real label information (e.g., class labels), has made semi-supervised learning a problem of considerable practical importance in modern data analysis. Semi-supervised learning can be viewed as supervised learning with additional information on the distribution of the examples or, equivalently, as an extension of unsupervised learning guided by constraints. In this article we present a methodology that bridges artificial neural network output vectors and logical constraints. To do this, we present a semantic loss function and a generalized entropy loss function (Rényi entropy) that capture how close the neural network is to satisfying the constraints on its output. Our methods are intended to be generally applicable and compatible with any feedforward neural network; accordingly, the semantic loss and the generalized entropy loss are simply regularization terms that can be plugged directly into an existing loss function. We evaluate our methodology on an artificially simulated dataset and on two commonly used benchmark datasets, MNIST and Fashion-MNIST, to assess the relationship between the analyzed loss functions and the influence of various input and tuning parameters on classification accuracy. The experimental evaluation shows that both losses effectively guide the learner to achieve (near-)state-of-the-art results on semi-supervised multiclass classification.
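For illustration, the sketch below shows how two such regularization terms could be added to a standard supervised objective. This is a minimal Python/PyTorch sketch under stated assumptions, not the paper's implementation: PyTorch itself, the "exactly-one" (one-hot) output constraint used for the semantic loss, and the weights w_sl and w_h are all assumptions for the example.

```python
import torch
import torch.nn.functional as F

def semantic_loss_exactly_one(probs, eps=1e-12):
    """Semantic loss for the 'exactly-one' constraint over class outputs:
    -log sum_i [ p_i * prod_{j != i} (1 - p_j) ], computed in log space
    for numerical stability. probs: (batch, n_classes) probabilities."""
    log_p = torch.log(probs.clamp_min(eps))
    log_1mp = torch.log((1.0 - probs).clamp_min(eps))
    total_1mp = log_1mp.sum(dim=1, keepdim=True)
    # Log of each term: log p_i + sum_{j != i} log(1 - p_j).
    log_terms = log_p + total_1mp - log_1mp
    return -torch.logsumexp(log_terms, dim=1).mean()

def renyi_entropy(probs, alpha=2.0, eps=1e-12):
    """Rényi entropy H_alpha(p) = log(sum_i p_i^alpha) / (1 - alpha);
    recovers Shannon entropy in the limit alpha -> 1."""
    s = (probs.clamp_min(eps) ** alpha).sum(dim=1)
    return (torch.log(s) / (1.0 - alpha)).mean()

# Hypothetical semi-supervised objective: cross-entropy on the labeled
# batch plus the two regularizers on the unlabeled batch; w_sl and w_h
# are illustrative tuning weights.
def combined_loss(labeled_logits, targets, unlabeled_logits,
                  w_sl=0.05, w_h=0.05):
    probs_u = torch.softmax(unlabeled_logits, dim=1)
    return (F.cross_entropy(labeled_logits, targets)
            + w_sl * semantic_loss_exactly_one(probs_u)
            + w_h * renyi_entropy(probs_u, alpha=2.0))
```

Minimizing either term pushes the network's predictions on unlabeled examples toward confident, constraint-satisfying (near one-hot) distributions, which is the usual rationale for entropy-style regularization in semi-supervised learning.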