Zhang Jiyan, Ding Hanze, Wu Zhangkai, Peng Ming, Liu Yanfang
College of Mathematics and Information Engineering, Longyan University, Longyan, China.
School of Computer Science, University of Technology Sydney, New South Wales, Australia.
PLoS One. 2025 Mar 17;20(3):e0318553. doi: 10.1371/journal.pone.0318553. eCollection 2025.
Few-shot segmentation models generalize quickly to unseen classes and segment at the pixel level, which makes them well suited to metal defect detection, where labeled data are scarce and fine-grained delineation is required in industrial settings. Existing studies, however, overlook the intra-class differences inherent in metal surface defect data, so models struggle to learn enough from the support set to guide segmentation of the query set. These differences fall into two types: semantic intra-class difference, induced by factors internal to the metal samples, and distortion intra-class difference, caused by external factors in the surroundings. To address them, we introduce a Local Descriptor-based Multi-Prototype Reasoning and Excitation Network (LDMP-RENet) that learns two-view guidance, i.e., local information from the graph space and global information from the feature space, and fuses the two for precise segmentation. Because the relational structure of local features embedded in the graph space helps mitigate the semantic difference, a multi-prototype reasoning module extracts local descriptor-based prototypes and assesses the relevance between local-view features in support-query pairs. Because global information helps mitigate the distortion difference, a multi-prototype excitation module captures the global-view relevance in the same pairs. Finally, an information fusion module integrates the prototypes learned in the global and local views to produce pixel-level masks. Extensive experiments on defect datasets show that the proposed network outperforms existing benchmarks and sets a new state of the art.
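The abstract does not give implementation details, so the following is only a rough NumPy sketch of the general prototype-matching idea it describes, not the LDMP-RENet architecture itself. It assumes masked average pooling for a single global prototype, a few k-means centroids of foreground descriptors as stand-ins for local prototypes, cosine similarity for support-query relevance, and a plain average as the fusion step. All function names, the `k` and `threshold` parameters, and the averaging rule are illustrative assumptions; the graph-space reasoning and learned excitation and fusion modules of the paper are not reproduced here.

```python
import numpy as np


def masked_average_prototype(features, mask):
    """Global-view prototype: mean of support features inside the defect mask.

    features: (C, H, W) support feature map; mask: (H, W) binary defect mask.
    """
    weights = mask.astype(features.dtype)
    total = weights.sum()
    if total == 0:
        raise ValueError("mask contains no foreground pixels")
    return (features * weights).sum(axis=(1, 2)) / total  # shape (C,)


def local_prototypes(features, mask, k=2, iters=10):
    """Local-view prototypes: k-means centroids of foreground descriptors.

    This is an assumed stand-in for the paper's local descriptor-based prototypes.
    """
    descriptors = features[:, mask.astype(bool)].T  # (N, C) foreground descriptors
    n = descriptors.shape[0]
    if n == 0:
        raise ValueError("mask contains no foreground pixels")
    k = min(k, n)
    # Deterministic initialization: evenly spaced foreground descriptors.
    centroids = descriptors[np.linspace(0, n - 1, k).astype(int)].copy()
    for _ in range(iters):
        dists = ((descriptors[:, None, :] - centroids[None, :, :]) ** 2).sum(-1)
        assign = dists.argmin(axis=1)
        for j in range(k):
            members = descriptors[assign == j]
            if len(members):  # keep the old centroid if a cluster is empty
                centroids[j] = members.mean(axis=0)
    return centroids  # shape (k, C)


def cosine_similarity_map(query_features, prototype, eps=1e-8):
    """Cosine similarity between one prototype and every query pixel."""
    c, h, w = query_features.shape
    q = query_features.reshape(c, -1)
    num = prototype @ q
    den = np.linalg.norm(prototype) * np.linalg.norm(q, axis=0) + eps
    return (num / den).reshape(h, w)


def fuse_and_segment(query_features, global_proto, local_protos, threshold=0.5):
    """Assumed fusion: average the global map with the best local map, then threshold."""
    global_sim = cosine_similarity_map(query_features, global_proto)
    local_sim = np.max(
        [cosine_similarity_map(query_features, p) for p in local_protos], axis=0
    )
    fused = 0.5 * (global_sim + local_sim)
    return fused > threshold  # boolean pixel-level mask, shape (H, W)
```

In this sketch, the support feature map and mask yield the prototypes, and `fuse_and_segment` applies them to the query feature map. In a real model the feature maps would come from a shared backbone, and the fixed average would be replaced by a learned fusion module.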