Xu Xueli, Geng Guohua, Cao Xin, Li Kang, Zhou Mingquan
Appl Opt. 2022 Feb 20;61(6):C80-C88. doi: 10.1364/AO.438396.
This study proposes TDNet, a transformer-based end-to-end network for point cloud denoising built on an encoder-decoder architecture, which is novel to the best of our knowledge. The encoder follows the structure of the transformer used in natural language processing (NLP). Although points and sentences are different types of data, the NLP transformer can be adapted to point clouds because each point can be regarded as a word. The adapted model extracts point cloud features and maps the input point cloud into an underlying high-dimensional space that can characterize the semantic relevance between points. The decoder then learns the latent manifold of each sampled point from the high-dimensional features obtained by the encoder and finally outputs a clean point cloud. During denoising, an adaptive sampling approach selects points closer to the clean point cloud to reconstruct the surface, based on the view that a 3D object is essentially a 2D manifold. Extensive experiments demonstrate that the proposed network achieves superior quantitative and qualitative results on synthetic data sets and on real-world terracotta warrior fragments.
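The idea of treating each point as a word can be illustrated with a minimal sketch of generic single-head scaled dot-product self-attention over a small point cloud. This is not the TDNet encoder, whose details the abstract does not give. The function names, the feature width `d_model`, and the random projection matrices are all hypothetical stand-ins for learned parameters.

```python
import math
import random

def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def softmax(row):
    """Numerically stable softmax over one list of scores."""
    m = max(row)
    exps = [math.exp(v - m) for v in row]
    total = sum(exps)
    return [e / total for e in exps]

def point_self_attention(points, w_q, w_k, w_v):
    """Single-head scaled dot-product self-attention, one token per point.

    points: N x 3 list of xyz coordinates.
    w_q, w_k, w_v: 3 x d projection matrices (random here, learned in practice).
    Returns (features, weights): N x d features and N x N attention weights.
    """
    q, k, v = matmul(points, w_q), matmul(points, w_k), matmul(points, w_v)
    d = len(w_q[0])
    k_t = [list(col) for col in zip(*k)]  # transpose keys to d x N
    scores = matmul(q, k_t)               # pairwise point-to-point scores
    weights = [softmax([s / math.sqrt(d) for s in row]) for row in scores]
    return matmul(weights, v), weights

rng = random.Random(0)
d_model = 8  # hypothetical feature width, not taken from the paper
points = [[rng.uniform(-1, 1) for _ in range(3)] for _ in range(5)]
w_q, w_k, w_v = (
    [[rng.gauss(0, 1) for _ in range(d_model)] for _ in range(3)]
    for _ in range(3)
)
features, weights = point_self_attention(points, w_q, w_k, w_v)
print(len(features), len(features[0]))
```

Each row of `weights` sums to 1 and records how strongly one point attends to every other point, which is one way to read the "semantic relevance between points" described above. A real encoder would stack several such layers with learned weights.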