Philipsen Mark Philip, Dueholm Jacob Velling, Jørgensen Anders, Escalera Sergio, Moeslund Thomas Baltzer
Media Technology, Aalborg University, 9000 Aalborg, Denmark.
IHFood, Carsten Niebuhrs Gade 10, 2. tv., 1577 Copenhagen, Denmark.
Sensors (Basel). 2018 Jan 3;18(1):117. doi: 10.3390/s18010117.
We present a pattern recognition framework for semantic segmentation of visual structures, that is, multi-class labelling at pixel level, and apply it to the task of segmenting organs in the eviscerated viscera of slaughtered poultry in RGB-D images. This is a step towards replacing the current strenuous manual inspection at poultry processing plants. Features are extracted from feature maps such as activation maps from a convolutional neural network (CNN). A random forest classifier assigns class probabilities, which are further refined by utilizing context in a conditional random field. The presented method is compatible with both 2D and 3D features, which allows us to explore the value of adding 3D and CNN-derived features. The dataset consists of 604 RGB-D images showing 151 unique sets of eviscerated viscera from four different perspectives. A mean Jaccard index of 78.11% is achieved across the four classes of organs using 2D, 3D, and CNN-derived features, compared to 74.28% using only basic 2D image features.
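To make the described pipeline concrete, the following is a minimal sketch, not the authors' implementation, of the per-pixel approach the abstract outlines: features gathered from feature maps (e.g., CNN activations) are classified by a random forest into class probabilities, which are then contextually refined. Here the conditional random field is stood in for by simple neighbourhood smoothing of the probability maps, and all names, shapes, and parameter values are illustrative assumptions.

```python
import numpy as np
from scipy.ndimage import uniform_filter
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import jaccard_score

H, W, F, C = 64, 64, 16, 4  # image size, feature channels, organ classes (assumed)

# Toy stand-ins for real data: per-pixel feature maps (e.g., stacked CNN
# activations and hand-crafted 2D/3D features) and ground-truth labels.
feature_maps = np.random.rand(H, W, F).astype(np.float32)
labels = np.random.randint(0, C, size=(H, W))

# Train a random forest on pixels treated as independent samples.
X = feature_maps.reshape(-1, F)
y = labels.reshape(-1)
rf = RandomForestClassifier(n_estimators=50, max_depth=12, n_jobs=-1)
rf.fit(X, y)

# Per-pixel class probabilities, reshaped back to image layout.
proba = rf.predict_proba(X).reshape(H, W, C)

# Crude contextual refinement: average each class's probability map over a
# local window (a stand-in for the conditional random field in the paper).
smoothed = np.stack(
    [uniform_filter(proba[..., c], size=5) for c in range(C)], axis=-1
)
segmentation = smoothed.argmax(axis=-1)  # final per-pixel labels

# Mean Jaccard index across classes, the evaluation measure quoted above.
mean_jaccard = jaccard_score(y, segmentation.reshape(-1), average="macro")
print(f"mean Jaccard index: {mean_jaccard:.4f}")
```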