Bruni Vittoria, Cardinali Maria Lucia, Vitulano Domenico
Department of Basic and Applied Sciences for Engineering, Sapienza Rome University, Via Antonio Scarpa 16, 00161 Rome, Italy.
Istituto per le Applicazioni del Calcolo, Via dei Taurini 19, 00185 Rome, Italy.
Entropy (Basel). 2022 Feb 13;24(2):269. doi: 10.3390/e24020269.
The minimum description length (MDL) is a powerful criterion for model selection that is attracting increasing interest from both theorists and practitioners. It allows the best model for representing the data to be selected automatically, without requiring a priori information about the data. It relies only on the data and on model complexity, selecting, from a predefined set of models, the one that yields the shortest code length. In this paper, we briefly review the basic ideas underlying the MDL criterion and its applications in different fields, with particular reference to the dimension reduction problem. As an example, the role of MDL in selecting the best principal components in the well-known principal component analysis (PCA) is investigated.
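As an illustration of how an MDL-type rule can choose the number of principal components, the following is a minimal sketch and not the formulation used in the paper. It assumes the classical eigenvalue-based criterion of Wax and Kailath from the signal-processing literature, in which the code length for keeping k components is a data term (how far the p - k discarded eigenvalues are from being equal) plus a model-complexity term that grows with k. The function name `mdl_num_components` and the toy spectrum are hypothetical; the parameter count k(2p - k) follows that classical formulation, and other variants use slightly different counts.

```python
import math


def mdl_num_components(eigenvalues, n_samples):
    """Pick the number of principal components with an MDL-type score.

    Assumption: Wax-Kailath style criterion (not taken from the paper).
    eigenvalues: positive eigenvalues of the sample covariance matrix.
    n_samples:   number of observations used to estimate the covariance.
    Returns (best_k, scores), where scores[k] is the score for keeping k
    components, k = 0 .. p-1.
    """
    lam = sorted(eigenvalues, reverse=True)
    if not lam or min(lam) <= 0:
        raise ValueError("eigenvalues must be non-empty and strictly positive")
    p = len(lam)
    scores = []
    for k in range(p):
        tail = lam[k:]            # eigenvalues treated as noise
        m = p - k
        log_geo = sum(math.log(v) for v in tail) / m   # log geometric mean
        log_arith = math.log(sum(tail) / m)            # log arithmetic mean
        # Data term: zero when all discarded eigenvalues are equal.
        data_cost = n_samples * m * (log_arith - log_geo)
        # Model term: penalty on the number of free parameters.
        model_cost = 0.5 * k * (2 * p - k) * math.log(n_samples)
        scores.append(data_cost + model_cost)
    best_k = min(range(p), key=scores.__getitem__)
    return best_k, scores


if __name__ == "__main__":
    # Toy spectrum: two dominant directions over a flat noise floor.
    best_k, scores = mdl_num_components([10.0, 5.0, 1.0, 1.0, 1.0], n_samples=100)
    print(best_k)  # selects k = 2 for this toy spectrum
```

In practice the eigenvalues would come from the sample covariance of the data, for example via `numpy.linalg.eigvalsh`. The sketch takes them as input so that it depends only on the standard library.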