Pham Gia-Thuy, Boyer Rémy, Nielsen Frank
Laboratory of Signals and Systems (L2S), Department of Signals and Statistics, University of Paris-Sud, 91400 Orsay, France.
Computer Science Department LIX, École Polytechnique, 91120 Palaiseau, France.
Entropy (Basel). 2018 Mar 17;20(3):203. doi: 10.3390/e20030203.
Evaluating the performance of Bayesian classification in a high-dimensional random tensor is a fundamental problem, usually difficult and under-studied. In this work, we consider two Signal to Noise Ratio (SNR)-based binary classification problems of interest. Under the alternative hypothesis, i.e., for a non-zero SNR, the observed signals are either a noisy rank- tensor admitting a -order Canonical Polyadic Decomposition (CPD) with large factors of size N q × R , i.e., for 1 ≤ q ≤ Q , where R , N q → ∞ with R 1 / q / N q converge towards a finite constant or a noisy tensor admitting TucKer Decomposition (TKD) of multilinear ( M 1 , … , M Q ) -rank with large factors of size N q × M q , i.e., for 1 ≤ q ≤ Q , where N q , M q → ∞ with M q / N q converge towards a finite constant. The classification of the random entries (coefficients) of the core tensor in the CPD/TKD is hard to study since the exact derivation of the minimal Bayes' error probability is mathematically intractable. To circumvent this difficulty, the Chernoff Upper Bound (CUB) for larger SNR and the Fisher information at low SNR are derived and studied, based on information geometry theory. The tightest CUB is reached for the value minimizing the error exponent, denoted by s ⋆ . In general, due to the asymmetry of the -divergence, the Bhattacharyya Upper Bound (BUB) (that is, the Chernoff Information calculated at s ⋆ = 1 / 2 ) cannot solve this problem effectively. As a consequence, we rely on a costly numerical optimization strategy to find s ⋆ . However, thanks to powerful random matrix theory tools, a simple analytical expression of s ⋆ is provided with respect to the Signal to Noise Ratio (SNR) in the two schemes considered. This work shows that the BUB is the tightest bound at low SNRs. However, for higher SNRs, the latest property is no longer true.
评估贝叶斯分类在高维随机张量中的性能是一个基本问题,通常既困难又研究不足。在这项工作中,我们考虑了两个基于信噪比(SNR)的二元分类问题。在备择假设下,即对于非零信噪比,观测信号要么是一个噪声秩张量,它允许具有大小为(N_q\times R)的大因子的(q)阶典范多adic分解(CPD),即对于(1\leq q\leq Q),其中(R),(N_q\rightarrow\infty)且(R^{1/q}/N_q)收敛于一个有限常数;要么是一个噪声张量,它允许具有多线性((M_1,\ldots,M_Q))秩且因子大小为(N_q\times M_q)的塔克分解(TKD),即对于(1\leq q\leq Q),其中(N_q),(M_q\rightarrow\infty)且(M_q/N_q)收敛于一个有限常数。CPD/TKD中核心张量的随机元素(系数)的分类很难研究,因为最小贝叶斯错误概率的精确推导在数学上是难以处理的。为了规避这个困难,基于信息几何理论,推导并研究了高信噪比下的切尔诺夫上界(CUB)和低信噪比下的费希尔信息。通过最小化误差指数的值(s^\star)可得到最紧的CUB。一般来说,由于(\alpha)散度的不对称性,巴塔查里亚上界(BUB)(即(s^\star = 1/2)时计算的切尔诺夫信息)不能有效地解决这个问题。因此,我们依靠一种代价高昂的数值优化策略来找到(s^\star)。然而,多亏了强大的随机矩阵理论工具,在所考虑的两种方案中,针对信噪比(SNR)给出了(s^\star)的一个简单解析表达式。这项工作表明,BUB在低信噪比下是最紧的界。然而,对于更高的信噪比,上述性质不再成立。