IEEE Trans Image Process. 2017 Dec;26(12):5590-5602. doi: 10.1109/TIP.2017.2736419. Epub 2017 Aug 4.
Topic models [e.g., probabilistic latent semantic analysis, latent Dirichlet allocation (LDA), and supervised LDA] have been widely used for segmenting imagery. However, these models are confined to crisp segmentation, forcing each visual word (i.e., an image patch) to belong to one and only one topic. Yet many images contain regions that cannot be assigned a crisp categorical label (e.g., transition regions between a foggy sky and the ground, or between sand and water at a beach). In these cases, a visual word is best represented with partial memberships across multiple topics. To address this, we present a partial membership LDA (PM-LDA) model and an associated parameter estimation algorithm. This model can be useful for imagery in which a visual word may be a mixture of multiple topics. Experimental results on visual and sonar imagery show that PM-LDA can produce both crisp and soft semantic image segmentations, a capability that previous topic modeling methods lack.
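The distinction between crisp and partial membership can be illustrated with a small toy sketch. This is not the PM-LDA model or its parameter estimation algorithm; the function names, the two-topic beach example, and the rule of blending topic features by membership weight are all assumptions made purely for illustration.

```python
# Toy illustration only: names and the blending rule are hypothetical,
# not taken from the PM-LDA paper.

def crisp_assignment(memberships):
    """Return a one-hot vector that assigns the visual word entirely
    to its single highest-membership topic (crisp segmentation)."""
    best = max(range(len(memberships)), key=lambda k: memberships[k])
    return [1.0 if k == best else 0.0 for k in range(len(memberships))]


def mix_topics(memberships, topic_features):
    """Blend per-topic feature vectors, weighting each topic by the
    visual word's membership in it (soft segmentation)."""
    dim = len(topic_features[0])
    return [
        sum(z * feats[d] for z, feats in zip(memberships, topic_features))
        for d in range(dim)
    ]


# Hypothetical patch on a shoreline: 60% "sand", 40% "water".
patch_memberships = [0.6, 0.4]
# Hypothetical 2-D feature prototypes for the "sand" and "water" topics.
prototypes = [[0.0, 2.0], [2.0, 0.0]]

crisp = crisp_assignment(patch_memberships)      # the patch is forced to be all "sand"
soft = mix_topics(patch_memberships, prototypes)  # the patch keeps both contributions
```

Under the crisp view the 40% water contribution is discarded, whereas the partial membership vector retains it, which is the behavior the abstract describes for transition regions.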