IDNet: Information Decomposition Network for Fast Panoptic Segmentation.

Author information

Lin Guangchen, Li Songyuan, Chen Yifeng, Li Xi

Publication information

IEEE Trans Image Process. 2024;33:1487-1496. doi: 10.1109/TIP.2023.3234499. Epub 2024 Feb 21.

Abstract

Traditional CNN-based pipelines for panoptic segmentation decompose the task into two subtasks: instance segmentation and semantic segmentation. They extract information with multiple branches, perform the two subtasks separately, and finally fuse the results. However, excessive feature extraction and complicated processing make them time-consuming. We propose IDNet, which decomposes panoptic segmentation at the information level. IDNet extracts only two kinds of information and completes the panoptic segmentation task directly, saving the effort of extracting extra information and fusing subtasks. By decomposing panoptic segmentation into category information and location information and recomposing them with a serial pipeline, the process is greatly simplified and unified with regard to stuff and things. We also adopt two correction losses specially designed for our serial pipeline, guaranteeing overall prediction performance. As a result, IDNet strikes a better balance between effectiveness and efficiency, achieving the fastest inference speed of 24.2 FPS at a resolution of 800×1333 on a Tesla V100 GPU and a PQ of 43.8, which is comparable to other one-stage CNN-based methods. The code will be released at https://github.com/AronLin/IDNet.
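
For readers unfamiliar with the two headline numbers, the following is a minimal sketch of how they are usually interpreted. It assumes the standard definition of panoptic quality (summed IoU of matched segments divided by TP + 0.5·FP + 0.5·FN) and is not taken from the IDNet code. The function names `panoptic_quality` and `latency_ms` are hypothetical, and the IoU values in the usage lines are made up for illustration.

```python
from typing import Sequence


def panoptic_quality(matched_ious: Sequence[float],
                     num_false_positives: int,
                     num_false_negatives: int) -> float:
    """Panoptic quality for one category (standard definition, assumed here).

    matched_ious holds the IoU of every predicted/ground-truth segment pair
    that the caller has already accepted as a true positive (IoU > 0.5).
    """
    num_true_positives = len(matched_ious)
    denominator = (num_true_positives
                   + 0.5 * num_false_positives
                   + 0.5 * num_false_negatives)
    if denominator == 0:
        # No segments of this category in prediction or ground truth.
        return 0.0
    return sum(matched_ious) / denominator


def latency_ms(frames_per_second: float) -> float:
    """Convert a throughput in FPS to the average time per image in ms."""
    return 1000.0 / frames_per_second


# Illustrative values only: three matched segments, one FP, one FN.
example_pq = panoptic_quality([0.9, 0.8, 0.7], 1, 1)   # about 0.6
# The 24.2 FPS quoted in the abstract corresponds to about 41.3 ms per image.
example_latency = latency_ms(24.2)
```

Reported PQ values such as 43.8 are conventionally this ratio averaged over categories and scaled by 100.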
