Kulkarni Vaibhav, Mahalunkar Abhijit, Garbinato Benoit, Kelleher John D
Department of Information Systems, UNIL-HEC Lausanne, 1015 Lausanne, Switzerland.
Applied Intelligence Research Center, Technological University Dublin, D08 NF82 Dublin, Ireland.
Entropy (Basel). 2019 Apr 24;21(4):432. doi: 10.3390/e21040432.
We challenge the upper bound on human-mobility predictability that is widely used to corroborate the accuracy of mobility prediction models. We observe that extensions of recurrent neural network architectures achieve significantly higher prediction accuracy, surpassing this upper bound. Given this discrepancy, the central objective of our work is to show that the methodology behind the estimation of the predictability upper bound is erroneous and to identify the reasons behind this discrepancy. To explain this anomaly, we shed light on several underlying assumptions that have contributed to this bias. In particular, we highlight the consequences of the assumed Markovian nature of human mobility for the derivation of the upper bound on mobility predictability. Using several statistical tests on three real-world mobility datasets, we show that human mobility exhibits scale-invariant long-distance dependencies, in contrast to the initial Markovian assumption. We show that this assumption of exponential decay of information in mobility trajectories, coupled with the inadequate use of encoding techniques, results in entropy inflation and consequently lowers the upper bound on predictability. We highlight that the current upper-bound computation methodology, based on Fano's inequality, tends to overlook the long-range structural correlations inherent to mobility behaviors, and we demonstrate the significance of these correlations using an alternate encoding scheme. We further show the consequences of not accounting for these dependencies by probing the mutual information decay in mobility trajectories. We expose the systematic bias that culminates in an inaccurate upper bound and explain why recurrent neural architectures, designed to handle long-range structural correlations, surpass this upper limit on human-mobility predictability.
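To illustrate why entropy inflation lowers the predictability bound, the following is a minimal sketch of how a Fano-based upper bound is typically obtained from an entropy estimate. It assumes the form of Fano's inequality commonly used in the mobility-predictability literature, S = H(Π) + (1 − Π) log2(N − 1), where S is the estimated entropy rate in bits, N the number of distinct locations, and H the binary entropy function. The abstract does not state this form, and the function names and the bisection solver are illustrative choices, not the authors' implementation.

```python
import math


def binary_entropy(p):
    """Binary entropy H(p) in bits; defined as 0 at the endpoints."""
    if p <= 0.0 or p >= 1.0:
        return 0.0
    return -p * math.log2(p) - (1.0 - p) * math.log2(1.0 - p)


def fano_rhs(p, n_locations):
    """Right-hand side of the assumed Fano relation for predictability p."""
    return binary_entropy(p) + (1.0 - p) * math.log2(n_locations - 1)


def max_predictability(entropy_bits, n_locations, tol=1e-10):
    """Solve fano_rhs(p, N) = entropy_bits for p in [1/N, 1] by bisection.

    On this interval the right-hand side decreases from log2(N) to 0,
    so the root is unique.
    """
    if n_locations < 2:
        raise ValueError("need at least two distinct locations")
    if not 0.0 <= entropy_bits <= math.log2(n_locations):
        raise ValueError("entropy must lie in [0, log2(N)]")
    lo, hi = 1.0 / n_locations, 1.0
    while hi - lo > tol:
        mid = (lo + hi) / 2.0
        if fano_rhs(mid, n_locations) > entropy_bits:
            lo = mid  # entropy at mid is too high, so the root lies above mid
        else:
            hi = mid
    return (lo + hi) / 2.0


if __name__ == "__main__":
    # Hypothetical entropy values, for illustration only.
    for s in (0.5, 1.0, 2.0):
        print(f"S = {s:.1f} bits, N = 50 -> bound = {max_predictability(s, 50):.3f}")
```

Because the right-hand side decreases monotonically in Π, any overestimate of S translates directly into a lower bound, which is the mechanism the abstract refers to as entropy inflation.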