Mossel Elchanan, Ohannessian Mesrob I
Department of Mathematics, Massachusetts Institute of Technology, Cambridge, MA 02142, USA.
Toyota Technological Institute at Chicago, Chicago, IL 60637, USA.
Entropy (Basel). 2019 Jan 2;21(1):28. doi: 10.3390/e21010028.
This paper shows that one cannot learn the probability of rare events without imposing further structural assumptions. The event of interest is that of obtaining an outcome outside the coverage of an i.i.d. sample from a discrete distribution. The probability of this event is referred to as the "missing mass". The impossibility result can then be stated as: the missing mass is not distribution-free learnable in relative error. The proof is semi-constructive and relies on a coupling argument using a dithered geometric distribution. Via a reduction, this impossibility also extends to both discrete and continuous tail estimation. These results formalize the folklore that in order to predict rare events without restrictive modeling, one necessarily needs distributions with "heavy tails".
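As a minimal sketch of the quantities named in the abstract, the Python snippet below computes the missing mass of an i.i.d. sample when the underlying distribution is known, and measures the relative error of an estimate. Everything specific here is an assumption made for illustration and is not taken from the paper: the plain (undithered) geometric distribution, the parameter p = 0.5, the sample size, the function names, and the Good-Turing estimator (fraction of sample points whose value occurs exactly once), which stands in as one familiar estimator. The relative error is taken as |estimate / truth - 1|, which is one common reading of the term.

```python
import random
from collections import Counter


def geometric_pmf(k, p):
    """P(X = k) for a geometric distribution on {1, 2, ...} (illustrative choice)."""
    return (1.0 - p) ** (k - 1) * p


def sample_geometric(p, rng):
    """Draw one geometric variate by counting trials until the first success."""
    k = 1
    while rng.random() >= p:
        k += 1
    return k


def missing_mass(sample, pmf):
    """Probability of drawing an outcome that does not appear in the sample."""
    return 1.0 - sum(pmf(k) for k in set(sample))


def good_turing(sample):
    """Good-Turing estimate: fraction of sample points whose value occurs once."""
    counts = Counter(sample)
    singletons = sum(1 for c in counts.values() if c == 1)
    return singletons / len(sample)


def relative_error(estimate, truth):
    """Relative error |estimate / truth - 1|; requires truth > 0."""
    return abs(estimate / truth - 1.0)


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed so the run is repeatable
    p = 0.5                 # assumed parameter, for illustration only
    sample = [sample_geometric(p, rng) for _ in range(50)]
    truth = missing_mass(sample, lambda k: geometric_pmf(k, p))
    estimate = good_turing(sample)
    print("true missing mass:    ", truth)
    print("Good-Turing estimate: ", estimate)
    if estimate > 0:
        print("relative error:       ", relative_error(estimate, truth))
```

For a light-tailed distribution such as this one, the true missing mass is typically very small, so an estimate with small absolute error can still have a large relative error. That gap is what the relative-error criterion in the abstract targets.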