Pavlichin Dmitri S, Weissman Tsachy
Stanford University.
Proc IEEE Int Symp Info Theory. 2016 Jul;2016:580-584. doi: 10.1109/ISIT.2016.7541365. Epub 2016 Aug 11.
We define and characterize the "chained" Kullback-Leibler divergence $\min_w D(p\|w) + D(w\|q)$, minimized over all intermediate distributions $w$, and the analogous $k$-fold chained K-L divergence $\min D(p\|w_{k-1}) + \ldots + D(w_2\|w_1) + D(w_1\|q)$, minimized over the entire path $(w_1,\ldots,w_{k-1})$. This quantity arises in a large deviations analysis of a Markov chain on the set of types, the Wright-Fisher model of neutral genetic drift: a population with allele distribution $q$ produces offspring with allele distribution $w$, which then produce offspring with allele distribution $p$, and so on. The chained divergences enjoy some of the same properties as the K-L divergence (like joint convexity in the arguments) and appear in $k$-step versions of some of the same settings as the K-L divergence (like information projections and a conditional limit theorem). We further characterize the optimal $k$-step "path" of distributions appearing in the definition and apply our findings in a large deviations analysis of the Wright-Fisher process. We make a connection to information geometry via the previously studied continuum limit, where the number of steps tends to infinity and the limiting path is a geodesic in the Fisher information metric. Finally, we offer a thermodynamic interpretation of the chained divergence (as the rate of operation of an appropriately defined Maxwell's demon) and we state some natural extensions and applications (a $k$-step mutual information and $k$-step maximum likelihood inference). We release code for computing the objects we study.
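As an illustration of the one-intermediate case, the sketch below numerically evaluates $\min_w D(p\|w) + D(w\|q)$, writing $q$ for the starting distribution, $w$ for the intermediate one, and $p$ for the final one. It is not the authors' released code. It assumes $p$ and $q$ are strictly positive probability vectors of equal length and relies on a derivation made here rather than taken from the paper: the objective is convex in $w$, and the Lagrange condition for the constraint $\sum_i w_i = 1$ reads $\log(w_i/q_i) + c = p_i/w_i$ for a scalar $c$, which is solved by nested bisection. The names `kl`, `_solve_w`, and `chained_kl` are hypothetical.

```python
import math


def kl(a, b):
    """D(a||b) in nats; terms with a_i = 0 contribute 0. Assumes b_i > 0."""
    return sum(ai * math.log(ai / bi) for ai, bi in zip(a, b) if ai > 0)


def _solve_w(p_i, q_i, c, w_hi):
    """Solve log(w/q_i) + c = p_i/w for w in (0, w_hi] by bisection.

    The left side minus the right side is increasing in w, so the root is unique.
    """
    lo, hi = 0.0, w_hi
    for _ in range(100):
        mid = 0.5 * (lo + hi)
        if math.log(mid / q_i) + c - p_i / mid < 0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def chained_kl(p, q):
    """Return (min_w D(p||w) + D(w||q), minimizing w).

    Assumption (not from the source): p and q are strictly positive and sum to 1.
    """
    q_min = min(q)
    w_hi = 1.0 / q_min  # upper bracket for every coordinate of w
    # Multiplying the stationarity condition by w_i and summing gives
    # c = 1 - D(w||q), and 0 <= D(w||q) <= -log(q_min), which brackets c.
    c_lo, c_hi = 1.0 + math.log(q_min), 1.0
    for _ in range(60):
        c = 0.5 * (c_lo + c_hi)
        total = sum(_solve_w(pi, qi, c, w_hi) for pi, qi in zip(p, q))
        if total > 1.0:  # the sum of w_i decreases as c grows
            c_lo = c
        else:
            c_hi = c
    c = 0.5 * (c_lo + c_hi)
    w = [_solve_w(pi, qi, c, w_hi) for pi, qi in zip(p, q)]
    return kl(p, w) + kl(w, q), w


if __name__ == "__main__":
    value, w = chained_kl([0.9, 0.1], [0.2, 0.8])
    print("chained divergence:", value)
    print("intermediate distribution:", w)
    print("ordinary D(p||q):", kl([0.9, 0.1], [0.2, 0.8]))
```

Choosing $w = p$ or $w = q$ makes the objective equal to $D(p\|q)$, so the returned value should never exceed the ordinary divergence, which gives a quick sanity check. The $k$-fold version would need a minimization over the whole path and is not attempted here.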