Computing Department, Federal University of Ouro Preto, Ouro Preto 35400-000, MG, Brazil.
Department of Informatics, Federal University of Paraná, Curitiba 81531-990, PR, Brazil.
Sensors (Basel). 2019 Jul 5;19(13):2968. doi: 10.3390/s19132968.
Multimodal systems are a means to enhance the robustness and effectiveness of biometric systems, and a proper multimodal dataset is of the utmost importance to build such systems. The literature presents some multimodal datasets; however, to the best of our knowledge, no previous study combines face, iris/eye, and vital signals such as the Electrocardiogram (ECG). Moreover, there is no methodology to guide the construction and evaluation of a chimeric dataset. Taking this into account, in this work we propose to create a chimeric dataset from three modalities: ECG, eye, and face. We also propose a generic and systematic protocol that imposes constraints on the creation of homogeneous chimeric individuals, which allows us to perform a fair and reproducible benchmark. Moreover, we propose a multimodal approach for these modalities based on state-of-the-art deep representations built by convolutional neural networks. We conduct the experiments in the open-world verification mode and in two different scenarios (intra-session and inter-session), using three modalities from two datasets: CYBHi (ECG) and FRGC (eye and face). Our multimodal approach achieves an impressive decidability of 7.20 ± 0.18, yielding an almost perfect verification system (i.e., an Equal Error Rate (EER) of 0.20% ± 0.06) in the intra-session scenario with unknown data. In the inter-session scenario, we achieve a decidability of 7.78 ± 0.78 and an EER of 0.06% ± 0.06. In summary, these figures represent a gain of over 28% in decidability and a reduction of over 11% in EER in the intra-session scenario with unknown data, compared with the best known unimodal approach. Furthermore, we achieve an improvement greater than 22% in decidability and an EER reduction of over 6% in the inter-session scenario.
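For reference, the two figures of merit reported above are standard in biometric verification: the decidability index d' (Daugman's measure of separation between genuine and impostor score distributions) and the EER (the operating point where false acceptance and false rejection rates coincide). The sketch below, assuming similarity scores where higher means more likely genuine, illustrates how both can be computed; the helper names and score arrays are illustrative, not the authors' code.

```python
import numpy as np

def decidability(genuine, impostor):
    """Daugman's decidability index d'.

    d' = |mu_gen - mu_imp| / sqrt((var_gen + var_imp) / 2)
    Larger values mean better-separated score distributions.
    """
    mu_g, mu_i = np.mean(genuine), np.mean(impostor)
    var_g, var_i = np.var(genuine), np.var(impostor)
    return abs(mu_g - mu_i) / np.sqrt((var_g + var_i) / 2.0)

def equal_error_rate(genuine, impostor):
    """Approximate EER by sweeping a decision threshold.

    At each candidate threshold t, an impostor score >= t is a false
    acceptance and a genuine score < t is a false rejection; the EER is
    approximated at the threshold where FAR and FRR are closest.
    """
    genuine = np.asarray(genuine)
    impostor = np.asarray(impostor)
    thresholds = np.sort(np.concatenate([genuine, impostor]))
    best_gap, best_eer = np.inf, 1.0
    for t in thresholds:
        far = np.mean(impostor >= t)  # false acceptance rate
        frr = np.mean(genuine < t)    # false rejection rate
        gap = abs(far - frr)
        if gap < best_gap:
            best_gap, best_eer = gap, (far + frr) / 2.0
    return best_eer
```

With well-separated synthetic distributions (e.g. genuine ~ N(2, 1) and impostor ~ N(0, 1)), d' approaches 2 and the EER approaches ~16%; the near-zero EERs in the abstract correspond to far larger decidability values.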