National Key Laboratory of Fundamental Science on Synthetic Vision, Chengdu 610065, China.
College of Computer Science, Sichuan University, Chengdu 610065, China.
Sensors (Basel). 2022 Oct 29;22(21):8293. doi: 10.3390/s22218293.
As one of the pioneering data representations, the point cloud depicts fine geometry directly and is used in many applications, including computer graphics, molecular structure analysis, and modern sensing signal processing. However, unlike computer graphics data obtained through auxiliary regularization techniques or synthesis, raw sensor/scanner (metric) data often contain natural random noise caused by multiple extrinsic factors, especially in high-speed imaging scenarios. Grid-like imaging techniques (e.g., RGB images or video frames), for their part, rely on Euclidean sampling/processing pipelines and tend to entangle the aspects of interest with environmental variations such as pose and illumination. 3D Facial Expression Recognition (3D FER) is a typical problem of this kind. It has entered a new stage of development, but two difficulties remain: implementing efficient feature abstraction methods for high-dimensional observations, and stabilizing those methods so that they are adequately robust to random exterior variations. In this paper, we propose a localized and smoothed overlapping kernel to extract discriminative inherent geometric features. By associating the induced deformation stability with certain types of exterior perturbations through the manifold scattering transform, we provide a novel framework that directly consumes point cloud coordinates for FER and requires no predefined meshes or other features/signals. Our compact framework achieves 78.33% accuracy on the Bosphorus dataset for the expression recognition challenge and 77.55% on BU-3DFE.