Zhang Wenhao, Smith Melvyn L, Smith Lyndon N, Farooq Abdul
J Opt Soc Am A Opt Image Sci Vis. 2016 Mar;33(3):333-44. doi: 10.1364/JOSAA.33.000333.
This paper compares encoded features from two-dimensional (2D) and three-dimensional (3D) face images to achieve accurate and robust automatic gender recognition. The Fisher vector encoding method is used to produce 2D, 3D, and fused features with enhanced discriminative power. For 3D face analysis, a two-source photometric stereo (PS) method is introduced that reconstructs 3D surfaces with accurate detail and good efficiency. In addition, a 2D+3D imaging device built around the two-source PS method has been developed; it simultaneously captures color images for 2D evaluation and PS images for 3D analysis. The system retains the reconstruction accuracy of the standard PS method (three or more lights) while simplifying both the reconstruction algorithm and the hardware design by requiring only two light sources. Being accurate, cheap, efficient, and nonintrusive, it also offers great potential for facilitating human-computer interaction. Ten types of low-level 2D and 3D features were tested and encoded for Fisher vector gender recognition. The Fisher vector encoding method was evaluated on the FERET, Color FERET, LFW, and FRGCv2 databases, yielding accuracies of 97.7%, 98.0%, 92.5%, and 96.7%, respectively. The 2D and 3D features were compared on a self-collected dataset, constructed with the 2D+3D imaging device in a series of data capture experiments. These experiments and evaluations show that the Fisher vector encoding method outperforms most state-of-the-art gender recognition methods. It was also observed that 3D features reconstructed by the two-source PS method further improve Fisher vector gender recognition performance, by up to 6% on the self-collected dataset.
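As a rough illustration of the Fisher vector encoding step mentioned in the abstract, the sketch below encodes a set of local descriptors against a diagonal-covariance Gaussian mixture model (GMM), using the commonly used "improved" Fisher vector formulation (gradients with respect to means and standard deviations, followed by power and L2 normalization). The abstract does not state which variant, GMM size, or descriptors the authors used, so the function name `fisher_vector`, its parameters, and the random data are all hypothetical and chosen only for illustration.

```python
import numpy as np

def fisher_vector(X, weights, means, variances):
    """Encode descriptors X (T x D) against a diagonal GMM with K components.

    weights: (K,), means: (K, D), variances: (K, D). All names are illustrative.
    """
    T, D = X.shape
    sigma = np.sqrt(variances)                                   # (K, D)
    # Deviation of each descriptor from each component, in units of sigma: (T, K, D)
    diff = (X[:, None, :] - means[None, :, :]) / sigma[None, :, :]
    # Log of the weighted Gaussian densities: (T, K)
    log_p = (np.log(weights)[None, :]
             - 0.5 * np.sum(diff ** 2, axis=2)
             - 0.5 * np.sum(np.log(2.0 * np.pi * variances), axis=1)[None, :])
    # Numerically stable softmax over components gives the posteriors
    log_p -= log_p.max(axis=1, keepdims=True)
    gamma = np.exp(log_p)
    gamma /= gamma.sum(axis=1, keepdims=True)
    # Normalized gradients with respect to means and standard deviations: (K, D) each
    g_mu = np.einsum("tk,tkd->kd", gamma, diff) / (T * np.sqrt(weights)[:, None])
    g_sigma = (np.einsum("tk,tkd->kd", gamma, diff ** 2 - 1.0)
               / (T * np.sqrt(2.0 * weights)[:, None]))
    fv = np.concatenate([g_mu.ravel(), g_sigma.ravel()])         # length 2 * K * D
    fv = np.sign(fv) * np.sqrt(np.abs(fv))                       # power normalization
    norm = np.linalg.norm(fv)
    return fv / norm if norm > 0 else fv                         # L2 normalization

# Toy usage with made-up data: 200 descriptors of dimension 8, 4 GMM components.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 8))
K = 4
weights = np.full(K, 1.0 / K)
means = rng.normal(size=(K, 8))
variances = np.ones((K, 8))
fv = fisher_vector(X, weights, means, variances)
print(fv.shape)  # (64,)
```

In a full pipeline the GMM would be fitted on training descriptors, and the resulting fixed-length vector would typically be passed to a linear classifier; whether the authors follow this exact arrangement is not stated in the abstract.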