Kernelized Bayesian Matrix Factorization.

Publication information

IEEE Trans Pattern Anal Mach Intell. 2014 Oct;36(10):2047-60. doi: 10.1109/TPAMI.2014.2313125.

Abstract

We extend kernelized matrix factorization with a fully Bayesian treatment and with the ability to work with multiple side information sources expressed as different kernels. Kernels have been introduced to integrate side information about the rows and columns, which is necessary for making out-of-matrix predictions. We specifically discuss binary output matrices, but extensions to real-valued matrices are straightforward. We extend the state of the art in two key aspects: (i) A fully conjugate probabilistic formulation of kernelized matrix factorization enables an efficient variational approximation, whereas fully Bayesian treatments are not computationally feasible in the earlier approaches. (ii) Multiple side information sources are included and treated as different kernels in multiple kernel learning, which additionally reveals which side information sources are informative. We then show that the framework can also be used for supervised and semi-supervised multilabel classification and multi-output regression, by considering samples and outputs as the domains where matrix factorization operates. Our method outperforms alternatives in predicting drug-protein interactions on two data sets. On multilabel classification, our algorithm obtains the lowest Hamming losses on 10 out of 14 data sets compared to five state-of-the-art multilabel classification algorithms. We finally show that the proposed approach outperforms alternatives in multi-output regression experiments on a yeast cell cycle data set.
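To make the role of the kernels concrete, the following is a minimal NumPy sketch of the predictive structure only: side-information kernels are combined with weights, projected into a low-dimensional latent space, and multiplied to give a score matrix whose sign is the binary prediction. It is not the variational inference procedure of the paper. The projection matrices `A_x` and `A_z` and the kernel weights `e_x` are random or hand-picked placeholders standing in for quantities that would be learned, and all function names, the RBF kernel choice, and the dimensions are assumptions made for illustration.

```python
import numpy as np

def rbf_kernel(X, Y, gamma):
    """Gaussian (RBF) kernel between the rows of X and the rows of Y."""
    sq = (X**2).sum(1)[:, None] + (Y**2).sum(1)[None, :] - 2 * X @ Y.T
    return np.exp(-gamma * np.maximum(sq, 0.0))

def combine_kernels(kernels, weights):
    """Weighted sum of kernel matrices (the multiple-kernel combination)."""
    return sum(w * K for w, K in zip(weights, kernels))

def latent_components(A, K):
    """Project kernel columns into R dimensions: (n_train, R)^T @ (n_train, n_query)."""
    return A.T @ K

def scores(A_x, K_x, A_z, K_z):
    """Score matrix: inner products of row and column latent components."""
    return latent_components(A_x, K_x).T @ latent_components(A_z, K_z)

rng = np.random.default_rng(0)
n_rows, n_cols, R = 6, 5, 2

# Hypothetical side information (features) for the row and column objects.
X = rng.normal(size=(n_rows, 3))
Z = rng.normal(size=(n_cols, 4))

# Two row-side kernels with placeholder weights; one column-side kernel.
gammas = [0.1, 1.0]
e_x = np.array([0.7, 0.3])
K_x = combine_kernels([rbf_kernel(X, X, g) for g in gammas], e_x)
K_z = rbf_kernel(Z, Z, 0.5)

# Placeholder projection matrices (these would be inferred from data).
A_x = rng.normal(size=(n_rows, R))
A_z = rng.normal(size=(n_cols, R))

F = scores(A_x, K_x, A_z, K_z)      # shape (n_rows, n_cols)
Y_hat = np.where(F > 0, 1, -1)      # binary predictions from the sign

# Out-of-matrix prediction: a new row object only needs its kernel
# values against the training rows. Here the "new" object is a copy of
# the first training row, so its scores should match row 0 of F.
x_new = X[:1]
k_new = combine_kernels([rbf_kernel(X, x_new, g) for g in gammas], e_x)
f_new = scores(A_x, k_new, A_z, K_z)  # shape (1, n_cols)
```

The last step shows why kernels matter for out-of-matrix prediction: a row that never appeared in the output matrix can still be scored, because its latent components depend only on its kernel similarities to the training rows.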
