Hartmann Tim Johannes, Hartmann Julien Ben Joachim, Friebe-Hoffmann Ulrike, Lato Christiane, Janni Wolfgang, Lato Krisztian
Universitäts-Hautklinik Tübingen, Tübingen, Germany.
Universitätsfrauenklinik Ulm, Ulm, Germany.
Geburtshilfe Frauenheilkd. 2022 Jul 21;82(9):955-969. doi: 10.1055/a-1866-2943. eCollection 2022 Sep.
To date, most approaches to facial expression recognition rely on two-dimensional images, although advanced methods using three-dimensional data exist. These, however, require stationary apparatus and thus lack portability and the ability to scale deployment. As human emotions, intent and even diseases may manifest in distinct facial expressions or changes therein, a portable yet capable solution is needed. Given the superior informative value of three-dimensional data on facial morphology, and because certain syndromes find expression in specific facial dysmorphisms, such a solution should allow the portable acquisition of true three-dimensional facial scans in real time. In this study we present a novel solution for the three-dimensional acquisition of facial geometry data and the recognition of facial expressions from it. The technology presented here requires only a smartphone or tablet with an integrated TrueDepth camera and enables real-time acquisition of the facial geometry and its categorization into distinct facial expressions. Our approach consisted of two parts: first, training data were acquired by asking a cohort of 226 medical students to adopt defined facial expressions while their current facial morphology was captured by our specially developed app running on iPads placed in front of the students. The facial expressions to be shown by the participants were "disappointed", "stressed", "happy", "sad" and "surprised". Second, the data were used to train a self-normalizing neural network. The set of all factors describing the facial expression at a given moment is referred to as a "snapshot". In total, over half a million snapshots were recorded in the study. Ultimately, the network achieved an overall accuracy of 80.54% after 400 epochs of training; in testing, an overall accuracy of 81.15% was determined.
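The abstract does not disclose the network's architecture beyond it being a self-normalizing neural network (Klambauer et al.), i.e., an MLP with SELU activations and LeCun-normal weight initialization. The sketch below is a minimal, hypothetical illustration of such a classifier: the input width of 52 (the number of blendshape coefficients an ARKit TrueDepth face anchor exposes), the hidden-layer sizes, and all other hyperparameters are assumptions for illustration only, not the authors' configuration.

```python
import numpy as np

# SELU constants from Klambauer et al., "Self-Normalizing Neural Networks".
SELU_LAMBDA = 1.0507009873554805
SELU_ALPHA = 1.6732632423543772

def selu(x):
    """SELU activation: drives activations toward zero mean / unit variance."""
    return SELU_LAMBDA * np.where(x > 0, x, SELU_ALPHA * (np.exp(x) - 1))

def lecun_normal(fan_in, fan_out, rng):
    """LeCun-normal init (std = 1/sqrt(fan_in)), required for self-normalization."""
    return rng.normal(0.0, 1.0 / np.sqrt(fan_in), size=(fan_in, fan_out))

class SNNClassifier:
    """Minimal self-normalizing MLP: snapshot vector -> SELU layers -> softmax."""

    def __init__(self, n_features=52, hidden=(64, 64), n_classes=5, seed=0):
        rng = np.random.default_rng(seed)
        dims = (n_features, *hidden, n_classes)
        self.weights = [lecun_normal(a, b, rng) for a, b in zip(dims, dims[1:])]
        self.biases = [np.zeros(b) for b in dims[1:]]

    def forward(self, x):
        # Hidden layers with SELU; final layer produces class logits.
        for W, b in zip(self.weights[:-1], self.biases[:-1]):
            x = selu(x @ W + b)
        logits = x @ self.weights[-1] + self.biases[-1]
        # Numerically stable softmax over the five expression classes.
        e = np.exp(logits - logits.max(axis=1, keepdims=True))
        return e / e.sum(axis=1, keepdims=True)

# Usage: one probability distribution over the five expressions per snapshot.
model = SNNClassifier()
probs = model.forward(np.random.default_rng(1).normal(size=(8, 52)))
```

Each row of `probs` sums to one; training (e.g., by cross-entropy minimization) is omitted here for brevity.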
Recall varied by snapshot category, ranging from 74.79% for "stressed" to 87.61% for "happy". Precision showed similar results, with "sad" achieving the lowest value at 77.48% and "surprised" the highest at 86.87%. The present work demonstrates that respectable results can be achieved even with data sets that pose certain challenges. Through various measures already incorporated into an optimized version of our app, we expect the training results to become significantly better and more precise in the future. A follow-up study with the new version of our app, which encompasses the suggested alterations and adaptations, is currently being conducted. We aim to build a large, open database of facial scans, not only for facial expression recognition but also to perform disease recognition and to monitor the progress of disease treatment.
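The per-category recall and precision figures above follow directly from a confusion matrix over the five expression classes. A short sketch of that computation (the confusion matrix passed to the test below is fabricated for illustration and does not reproduce the study's numbers):

```python
import numpy as np

LABELS = ["disappointed", "stressed", "happy", "sad", "surprised"]

def per_class_metrics(conf):
    """Per-class recall and precision from a confusion matrix whose
    rows are true labels and whose columns are predicted labels."""
    conf = np.asarray(conf, dtype=float)
    tp = np.diag(conf)                  # correct predictions per class
    recall = tp / conf.sum(axis=1)      # TP / (TP + FN), row-wise
    precision = tp / conf.sum(axis=0)   # TP / (TP + FP), column-wise
    return dict(zip(LABELS, recall)), dict(zip(LABELS, precision))
```

With over half a million snapshots, such class-wise metrics expose exactly the asymmetries reported above, e.g., a class that is easy to recognize ("happy") versus one that is frequently confused ("stressed").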