Łazęcka Małgorzata, Mielniczuk Jan
Institute of Computer Science, Polish Academy of Sciences, Jana Kazimierza 5, 01-248 Warsaw, Poland.
Faculty of Mathematics and Information Science, Warsaw University of Technology, Koszykowa 75, 00-662 Warsaw, Poland.
Entropy (Basel). 2020 Aug 31;22(9):974. doi: 10.3390/e22090974.
We consider a nonparametric Generative Tree Model and discuss the problem of selecting active predictors for the response in such a scenario. We investigate two popular information-based selection criteria: Conditional Infomax Feature Extraction (CIFE) and Joint Mutual Information (JMI), both of which are derived as approximations of the Conditional Mutual Information (CMI) criterion. We show that CIFE and JMI may behave differently from CMI, resulting in different orders in which predictors are chosen in the variable selection process. Explicit formulae for CMI and its two approximations in the generative tree model are obtained. As a byproduct, we establish expressions for the entropy of a multivariate Gaussian mixture and its mutual information with the mixing distribution.
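For reference, the three criteria named in the abstract are commonly written as the following greedy forward-selection scores (a sketch in standard feature-selection notation, not taken from the paper itself): Y denotes the response, X_k a candidate predictor, and S the index set of already selected predictors.

% Standard forms of the scores for candidate X_k given selected set S
% (notation assumed; CIFE and JMI are the usual approximations of the CMI score).
\begin{align*}
  J_{\mathrm{CMI}}(X_k)  &= I(X_k; Y \mid X_S), \\
  J_{\mathrm{CIFE}}(X_k) &= I(X_k; Y) - \sum_{j \in S} \bigl[ I(X_k; X_j) - I(X_k; X_j \mid Y) \bigr], \\
  J_{\mathrm{JMI}}(X_k)  &= I(X_k; Y) - \frac{1}{|S|} \sum_{j \in S} \bigl[ I(X_k; X_j) - I(X_k; X_j \mid Y) \bigr].
\end{align*}

In this form, JMI differs from CIFE only in averaging, rather than summing, the redundancy terms over the selected set, which is one source of the differing selection orders the abstract refers to.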