EnNet: Enhanced Interactive Information Network with Zero-Order Optimization.

Authors

Shao Yingzhao, Chen Yanxin, Yang Pengfei, Cheng Fei

Affiliations

State Key Laboratory of Integrated Services Networks, Xidian University, Xi'an 710071, China.

Key Laboratory of Smart Human-Computer Interaction and Wearable Technology of Shaanxi Province, Xi'an 710071, China.

Publication information

Sensors (Basel). 2024 Sep 30;24(19):6361. doi: 10.3390/s24196361.

Abstract

Interactive image segmentation greatly accelerates the creation of high-quality annotated image datasets, which underpin deep learning applications. However, existing methods make weak use of interaction information and incur high optimization costs, which leads to unexpected segmentation results and a heavier computational burden. To address these issues, this paper mines interactive information through both the network architecture and the optimization procedure. On the architecture side, the problem has two sources: the features of interactive regions are insufficiently representative in each layer, and the interactive information is weakened as it passes through the network hierarchy. The paper therefore proposes a network called EnNet, which addresses both sources by using attention mechanisms to integrate user interaction information across the entire image and by incorporating interaction information twice in a coarse-to-fine design. On the optimization side, the paper proposes using zero-order optimization during the first four iterations of training, which reduces computational overhead with only a minimal reduction in accuracy. Experimental results on the GrabCut, Berkeley, DAVIS, and SBD datasets validate the effectiveness of the proposed method, with an average NOC@90 that improves on RITM by 0.35.
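For readers unfamiliar with the NOC@90 figure quoted above, the following is a minimal sketch of how the metric is commonly computed in the interactive segmentation literature: the number of clicks needed before the predicted mask first reaches an IoU of 0.90 with the ground truth, averaged over the dataset. It is not taken from the paper. The function names (`iou`, `noc`, `mean_noc`) are hypothetical, and the cap of 20 clicks is an assumed convention rather than a value stated in the abstract.

```python
import numpy as np


def iou(pred, gt):
    """Intersection over union of two binary masks."""
    pred = np.asarray(pred, dtype=bool)
    gt = np.asarray(gt, dtype=bool)
    union = np.logical_or(pred, gt).sum()
    if union == 0:
        # Both masks empty: treat as a perfect match.
        return 1.0
    return float(np.logical_and(pred, gt).sum() / union)


def noc(ious_per_click, threshold=0.90, max_clicks=20):
    """Clicks needed until IoU first reaches `threshold`.

    `ious_per_click[k - 1]` is the IoU after the k-th click.
    Returns `max_clicks` (an assumed cap) if the threshold is never reached.
    """
    for k, value in enumerate(ious_per_click[:max_clicks], start=1):
        if value >= threshold:
            return k
    return max_clicks


def mean_noc(all_ious, threshold=0.90, max_clicks=20):
    """Average NoC over a dataset; lower is better."""
    return float(np.mean([noc(seq, threshold, max_clicks) for seq in all_ious]))
```

Because the metric counts clicks, a lower value means less user effort for the same mask quality, so a method improves on a baseline by reaching the threshold in fewer clicks on average.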

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/4ffd/11478929/4ef3cb376abb/sensors-24-06361-g001.jpg
