Svirsky Yonatan, Sharf Andrei
IEEE Trans Vis Comput Graph. 2021 Jul;27(7):3238-3249. doi: 10.1109/TVCG.2020.2968062. Epub 2021 May 27.
In this article, we introduce a differentiable rendering module that allows neural networks to process 3D data efficiently. The module is composed of continuous, piecewise differentiable functions defined as a sensor array of cells embedded in 3D space. The module is learnable and integrates easily into neural networks, so that data rendering can be optimized for a specific learning task with gradient-based methods in an end-to-end fashion. Essentially, the module's sensor cells are allowed to transform independently, so that each cell locally focuses on and senses a different part of the 3D data. Through optimization, the cells thus learn to focus on the important parts of the data, bypassing occlusions, clutter, and noise. Since the sensor cells originally lie on a grid, this is equivalent to a highly non-linear rendering of the scene into a 2D image. The module performs especially well in the presence of clutter and occlusions, and it also handles non-linear deformations, improving classification accuracy through proper rendering of the data. In our experiments, we apply the module to various learning tasks and demonstrate efficient classification, localization, and segmentation on cluttered and non-cluttered 2D/3D data.
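The sensor-cell idea in the abstract can be illustrated with a toy sketch. It is not the authors' implementation: the Gaussian sensing kernel, the function names (`make_grid`, `cell_response`, `cell_gradient`, `render`, `ascent_step`), and the objective (each cell simply maximizes its own response) are all assumptions made for illustration. In the paper, the cells are instead optimized end-to-end against the loss of the downstream network. The sketch only shows that cells starting on a grid can move independently under gradients while the image is still read out in grid order.

```python
import math

def make_grid(width, height):
    """Place width*height sensor cells on a regular grid in the plane z = 0."""
    return [[float(x), float(y), 0.0] for y in range(height) for x in range(width)]

def cell_response(cell, points, sigma):
    """Soft, differentiable sensing (assumed kernel): Gaussian-weighted point count."""
    total = 0.0
    for p in points:
        d2 = sum((c - q) ** 2 for c, q in zip(cell, p))
        total += math.exp(-d2 / (2.0 * sigma ** 2))
    return total

def cell_gradient(cell, points, sigma):
    """Analytic gradient of cell_response with respect to the cell position."""
    grad = [0.0, 0.0, 0.0]
    for p in points:
        d2 = sum((c - q) ** 2 for c, q in zip(cell, p))
        w = math.exp(-d2 / (2.0 * sigma ** 2)) / sigma ** 2
        for k in range(3):
            grad[k] += w * (p[k] - cell[k])
    return grad

def render(cells, points, width, height, sigma):
    """Read the cells out in their original grid order to form a 2D image."""
    flat = [cell_response(c, points, sigma) for c in cells]
    return [flat[row * width:(row + 1) * width] for row in range(height)]

def ascent_step(cells, points, sigma, lr):
    """Move every cell independently along its own gradient (toy objective)."""
    moved = []
    for c in cells:
        g = cell_gradient(c, points, sigma)
        moved.append([c[k] + lr * g[k] for k in range(3)])
    return moved

# Toy 3D data: a small cluster of points floating above the sensor plane.
points = [[1.2, 1.1, 0.8], [1.4, 0.9, 1.0], [1.0, 1.3, 0.9]]
cells = make_grid(3, 3)

before = render(cells, points, 3, 3, sigma=1.0)
for _ in range(20):
    cells = ascent_step(cells, points, sigma=1.0, lr=0.1)
after = render(cells, points, 3, 3, sigma=1.0)
```

After the update steps, the cells are no longer on a regular grid in 3D, yet `after` is still a 3x3 image indexed by each cell's original grid position, which is the sense in which moving the cells amounts to a non-linear rendering of the scene.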