Kanamori Takafumi, Osugi Naoya
Tokyo Institute of Technology, 2-12-1 Ookayama, Meguro-ku, Tokyo 152-8552, Japan.
RIKEN AIP, Nihonbashi 1-chome Mitsui Building, 15th floor, 1-4-1 Nihonbashi, Chuo-ku, Tokyo 103-0027, Japan.
Entropy (Basel). 2019 Jul 17;21(7):702. doi: 10.3390/e21070702.
The quality of online services depends heavily on the accuracy of the recommendations they provide to users. To improve that accuracy, researchers have proposed various similarity measures based on the assumption that similar people like or dislike similar items or people. Statistical models, such as stochastic block models, have also been used to understand network structures. In this paper, we discuss the relationship between similarity-based methods and statistical models, using Bernoulli mixture models and the expectation-maximization (EM) algorithm. The Bernoulli mixture model naturally leads to a completely positive matrix as the similarity matrix. We prove that most of the commonly used similarity measures also yield completely positive similarity matrices. Based on this relationship, we propose an algorithm that transforms a similarity matrix into a Bernoulli mixture model. This correspondence gives similarity-based methods a statistical interpretation. Using the algorithm, we conduct numerical experiments on synthetic data and on real-world data provided by an online dating site, and report the efficiency of the recommendation system based on Bernoulli mixture models.
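For readers unfamiliar with the ingredients named above, the following is a minimal sketch of textbook EM for a Bernoulli mixture on a binary user-by-item matrix. It is not the algorithm proposed in the paper. The function names `fit_bernoulli_mixture` and `responsibility_similarity`, their parameters, and the choice of R R^T (with R the responsibility matrix) as an example similarity are all assumptions made for illustration. The only property the second function is meant to show is that a matrix of the form B B^T with an entrywise nonnegative B is completely positive by definition.

```python
import numpy as np


def fit_bernoulli_mixture(X, n_components, n_iter=100, seed=0, eps=1e-9):
    """Textbook EM for a mixture of independent Bernoulli distributions.

    X is an (n_samples, n_features) binary array. Returns the mixing
    weights pi (K,), the Bernoulli means mu (K, D), and the
    responsibilities resp (N, K). Hypothetical helper, for illustration only.
    """
    rng = np.random.default_rng(seed)
    X = np.asarray(X, dtype=float)
    n, d = X.shape
    pi = np.full(n_components, 1.0 / n_components)
    # Start away from 0 and 1 so the logarithms below are finite.
    mu = rng.uniform(0.25, 0.75, size=(n_components, d))

    for _ in range(n_iter):
        # E-step: log p(x_i, z_i = k) for every sample i and component k.
        log_p = X @ np.log(mu).T + (1.0 - X) @ np.log(1.0 - mu).T + np.log(pi)
        # Subtract the row maximum before exponentiating for stability.
        log_p -= log_p.max(axis=1, keepdims=True)
        resp = np.exp(log_p)
        resp /= resp.sum(axis=1, keepdims=True)

        # M-step: responsibility-weighted counts and means.
        nk = resp.sum(axis=0) + eps  # eps guards against empty components
        pi = nk / nk.sum()
        mu = np.clip((resp.T @ X) / nk[:, None], eps, 1.0 - eps)

    return pi, mu, resp


def responsibility_similarity(resp):
    """Example similarity S = R R^T built from nonnegative responsibilities.

    S is completely positive because it factors as B B^T with B >= 0
    entrywise. This particular choice of S is an assumption, not the
    paper's definition.
    """
    return resp @ resp.T
```

Calling `fit_bernoulli_mixture(X, 2)` on a binary preference matrix `X` and passing the returned responsibilities to `responsibility_similarity` gives a symmetric, entrywise nonnegative user-by-user matrix. How the paper itself maps between similarity matrices and mixture parameters is not specified in this abstract.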