Févotte Cédric, Bertin Nancy, Durrieu Jean-Louis
CNRS-TELECOM ParisTech, 75014 Paris, France.
Neural Comput. 2009 Mar;21(3):793-830. doi: 10.1162/neco.2008.04-08-771.
This letter presents theoretical, algorithmic, and experimental results about nonnegative matrix factorization (NMF) with the Itakura-Saito (IS) divergence. We describe how IS-NMF is underpinned by a well-defined statistical model of superimposed Gaussian components and is equivalent to maximum likelihood estimation of variance parameters. This setting can accommodate regularization constraints on the factors through Bayesian priors. In particular, inverse-gamma and gamma Markov chain priors are considered in this work. Estimation can be carried out using a space-alternating generalized expectation-maximization (SAGE) algorithm; this leads to a novel type of NMF algorithm, whose convergence to a stationary point of the IS cost function is guaranteed. We also discuss the links between the IS divergence and other cost functions used in NMF, in particular, the Euclidean distance and the generalized Kullback-Leibler (KL) divergence. Building on these links, we describe how IS-NMF can also be performed using a gradient multiplicative algorithm (a standard algorithm structure in NMF) whose convergence is observed in practice, though not proven. Finally, we report a detailed experimental comparison of Euclidean-NMF, KL-NMF, and IS-NMF algorithms applied to the power spectrogram of a short piano sequence recorded in real conditions, with various initializations and model orders. We then show how IS-NMF can successfully be employed for denoising and upmix (mono to stereo conversion) of an original piece of early jazz music. These experiments indicate that IS-NMF correctly captures the semantics of audio and is better suited to the representation of music signals than NMF with the usual Euclidean and KL costs.
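As an illustration of the gradient multiplicative algorithm mentioned in the abstract, the following is a minimal NumPy sketch, not the authors' implementation. It assumes the standard definition of the IS divergence, d_IS(x | y) = x/y - log(x/y) - 1, summed over all entries, and the usual multiplicative updates for that cost with a unit exponent. The function names `is_divergence` and `is_nmf_mu`, the random initialization, and the small constant `eps` are hypothetical choices made for this example. The SAGE algorithm and the Bayesian priors are not covered here.

```python
import numpy as np


def is_divergence(V, V_hat):
    """Itakura-Saito divergence summed over all entries (both inputs > 0)."""
    R = V / V_hat
    return float(np.sum(R - np.log(R) - 1.0))


def is_nmf_mu(V, K, n_iter=200, seed=0, eps=1e-12):
    """Approximate V (F x N, strictly positive) by W @ H with K components.

    Uses multiplicative updates for the IS cost. Returns W, H and the cost
    recorded after each iteration. Illustrative sketch only.
    """
    rng = np.random.default_rng(seed)
    F, N = V.shape
    # Random strictly positive initialization (an assumption of this sketch).
    W = rng.random((F, K)) + eps
    H = rng.random((K, N)) + eps
    costs = []
    for _ in range(n_iter):
        # Update H with W fixed.
        V_hat = W @ H
        H *= (W.T @ (V / V_hat**2)) / (W.T @ (1.0 / V_hat))
        # Update W with H fixed, using the refreshed approximation.
        V_hat = W @ H
        W *= ((V / V_hat**2) @ H.T) / ((1.0 / V_hat) @ H.T)
        costs.append(is_divergence(V, W @ H))
    return W, H, costs


if __name__ == "__main__":
    # Toy stand-in for a power spectrogram: a positive low-rank matrix.
    rng = np.random.default_rng(1)
    V = (rng.random((20, 3)) + 0.1) @ (rng.random((3, 30)) + 0.1)
    W, H, costs = is_nmf_mu(V, K=3)
    print(f"IS cost: first iteration {costs[0]:.4f}, last iteration {costs[-1]:.4f}")
```

Because the updates only multiply by ratios of nonnegative quantities, `W` and `H` stay nonnegative without any explicit projection step. The sketch records the cost at each iteration so that its decrease can be inspected empirically, in line with the abstract's remark that convergence of this algorithm is observed in practice but not proven.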