Charline Le Lan, Laurent Dinh
Department of Statistics, University of Oxford, Oxford OX1 3LB, UK.
Google Research, Montreal, QC H3B 2Y5, Canada.
Entropy (Basel). 2021 Dec 16;23(12):1690. doi: 10.3390/e23121690.
Thanks to the tractability of their likelihood, several deep generative models show promise for seemingly straightforward but important applications like anomaly detection, uncertainty estimation, and active learning. However, the likelihood values empirically attributed to anomalies conflict with the expectations these proposed applications suggest. In this paper, we take a closer look at the behavior of distribution densities through the lens of reparametrization and show that these quantities carry less meaningful information than previously thought, beyond estimation issues or the curse of dimensionality. We conclude that the use of these likelihoods for anomaly detection relies on strong and implicit hypotheses, and highlight the necessity of explicitly formulating these assumptions for reliable anomaly detection.
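A minimal sketch, not taken from the paper, illustrating the change-of-variables argument the abstract refers to: an invertible reparametrization rescales a density by the Jacobian of the map, so likelihood comparisons between points depend on the chosen representation of the data, not only on the underlying distribution. Here the (assumed, illustrative) points x_typical and x_anomaly and the map f = Phi (the standard normal CDF) are chosen purely for demonstration.

# Sketch only: shows that an invertible reparametrization can erase the
# density gap between a typical point and an anomaly.
from scipy.stats import norm

x_typical, x_anomaly = 0.0, 4.0        # high- vs. low-density points under N(0, 1)

# Densities in the original coordinates.
p_typical = norm.pdf(x_typical)        # ~0.3989
p_anomaly = norm.pdf(x_anomaly)        # ~0.0001

# Reparametrize with the invertible map f = Phi, the standard normal CDF.
# Change of variables: p_Y(f(x)) = p_X(x) / |f'(x)|, and f'(x) = phi(x) = p_X(x),
# so every point of Y = f(X) ~ Uniform(0, 1) has density exactly 1.
q_typical = p_typical / norm.pdf(x_typical)   # = 1.0
q_anomaly = p_anomaly / norm.pdf(x_anomaly)   # = 1.0

print(f"original coordinates:    typical={p_typical:.4f}, anomaly={p_anomaly:.2e}")
print(f"after reparametrization: typical={q_typical:.1f}, anomaly={q_anomaly:.1f}")

Composing with further invertible maps can even make the "anomaly" more likely than the typical point, which is why using raw density values for anomaly detection implicitly assumes a particular representation of the data.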