Wang Wei, Tulyakov Sergey, Sebe Nicu
IEEE Trans Pattern Anal Mach Intell. 2018 Nov;40(11):2569-2582. doi: 10.1109/TPAMI.2018.2810881. Epub 2018 Mar 1.
Face alignment is now dominated by cascaded regression methods. These methods start from an image with an initial shape and build a sequence of shape increments, each computed from features extracted relative to the currently estimated shape. The increments move the initial shape toward the desired location. Despite their advantages, cascaded methods share two major limitations: (i) the shape increments are learned independently of each other, stage by stage, and (ii) the use of standard generic computer vision features, such as SIFT and HOG, prevents these methods from learning problem-specific features. In this work, we propose a novel Recurrent Convolutional Shape Regression (RCSR) method that overcomes these limitations. We formulate the standard cascaded alignment problem as a recurrent process and learn all shape increments jointly, using a recurrent neural network with a gated recurrent unit. Importantly, by combining a convolutional neural network with a recurrent one, we avoid the hand-crafted features widely adopted in the literature and thus allow the model to learn task-specific features. In addition, we employ a convolutional gated recurrent unit, which takes feature tensors as input instead of flattened feature vectors, so the spatial structure of the features is better preserved in the memory of the recurrent neural network. Moreover, the convolutional and the recurrent neural networks are learned jointly. Experimental evaluation shows that the proposed method performs better than state-of-the-art methods and further supports the importance of learning a single end-to-end model for face alignment.
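The recurrent formulation described in the abstract can be illustrated with a small, deliberately simplified sketch. It is not the RCSR implementation: the names `gru_step`, `toy_features`, `refine_shape`, and `DEMO_WEIGHTS` are hypothetical, the gated recurrent unit acts on plain per-coordinate scalars rather than convolutional feature tensors, the weights are hand-set rather than learned, and the "features" are an oracle offset toward a known target that stands in for CNN features sampled around the current shape. The only point it shows is the data flow: one shared recurrent cell, carrying memory across steps, produces every shape increment, instead of a separate regressor per cascade stage.

```python
import math

# Hand-set, untrained weights (assumption for illustration only).
DEMO_WEIGHTS = {"wz": 1.0, "uz": 0.0, "wr": 1.0, "ur": 0.0,
                "wh": 1.0, "uh": 0.5, "out": 0.5}


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def gru_step(hidden, features, w):
    """Apply a scalar GRU update independently to each coordinate."""
    new_hidden = []
    for h, x in zip(hidden, features):
        z = sigmoid(w["wz"] * x + w["uz"] * h)            # update gate
        r = sigmoid(w["wr"] * x + w["ur"] * h)            # reset gate
        cand = math.tanh(w["wh"] * x + w["uh"] * r * h)   # candidate state
        new_hidden.append((1.0 - z) * h + z * cand)
    return new_hidden


def toy_features(shape, target):
    """Stand-in for learned image features: offset from shape to target."""
    return [t - s for s, t in zip(shape, target)]


def refine_shape(initial_shape, target, w, steps=5):
    """Run the same recurrent cell `steps` times; return every estimate."""
    shape = list(initial_shape)
    hidden = [0.0] * len(shape)            # memory shared across all steps
    history = [list(shape)]
    for _ in range(steps):
        features = toy_features(shape, target)
        hidden = gru_step(hidden, features, w)
        increment = [w["out"] * h for h in hidden]  # shape increment
        shape = [s + d for s, d in zip(shape, increment)]
        history.append(list(shape))
    return history
```

Calling `refine_shape([0.0, 0.0], [1.0, -1.0], DEMO_WEIGHTS, steps=3)` returns the initial shape followed by three refined estimates. Because the weights are not trained, the trajectory itself carries no meaning. In the method the abstract describes, the convolutional feature extractor and the recurrent unit would be learned jointly, end to end.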