Bailey James, Houle Michael E, Ma Xingjun
School of Computing and Information Systems, The University of Melbourne, Melbourne, VIC 3010, Australia.
School of Computer Science, Fudan University, Shanghai 200437, China.
Entropy (Basel). 2022 Aug 30;24(9):1220. doi: 10.3390/e24091220.
Properties of data distributions can be assessed at both global and local scales. At a highly localized scale, a fundamental measure is the local intrinsic dimensionality (LID), which assesses growth rates of the cumulative distribution function within a restricted neighborhood and characterizes properties of the geometry of a local neighborhood. In this paper, we explore the connection of LID to other well-known measures for complexity assessment and comparison, namely, entropy and statistical distances or divergences. In an asymptotic context, we develop new analytical expressions for these quantities in terms of LID. This reveals the fundamental nature of LID as a building block for characterizing and comparing data distributions, opening the door to new methods for distributional analysis at a local scale.
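As a rough illustration of what "growth rate of the cumulative distribution function within a restricted neighborhood" can mean in practice, the following minimal sketch estimates LID from a sample of neighbor distances. It uses a maximum-likelihood (Hill-type) estimator that is common in the LID literature; this abstract does not specify an estimator, so that choice is an assumption, and the names `lid_mle` and `sample_distances` are hypothetical. The synthetic distances are drawn so that their distribution function is F(r) = r^dim on [0, 1], for which r F'(r) / F(r) = dim.

```python
import math
import random


def lid_mle(distances):
    """Hill-type maximum-likelihood LID estimate from neighbor distances.

    Assumption: this estimator is one common choice from the LID
    literature, not a method stated in the abstract.
    """
    r = sorted(d for d in distances if d > 0)  # drop zero distances
    if len(r) < 2:
        raise ValueError("need at least two positive distances")
    r_max = r[-1]  # radius of the neighborhood
    mean_log_ratio = sum(math.log(d / r_max) for d in r) / len(r)
    if mean_log_ratio == 0:
        raise ValueError("all distances are equal; estimate is undefined")
    return -1.0 / mean_log_ratio


def sample_distances(dim, n, rng):
    """Draw n distances whose CDF is F(r) = r**dim on [0, 1].

    This matches the distance from the center to a point drawn
    uniformly from the unit ball in R^dim (illustrative helper).
    """
    return [rng.random() ** (1.0 / dim) for _ in range(n)]


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed so runs are repeatable
    for dim in (2, 5, 10):
        estimate = lid_mle(sample_distances(dim, 2000, rng))
        print(f"true dimension {dim}: estimated LID {estimate:.2f}")
```

With a few thousand distances the estimate should land close to the true dimension; with small neighborhoods the variance of this estimator grows noticeably.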