Crowd counting in domain generalization based on multi-scale attention and hierarchy level enhancement.

Authors

Zhou Jiarui, Zhang Jianming, Gui Yan

Affiliation

School of Computer and Communication Engineering, Changsha University of Science and Technology, Changsha, 410114, China.

Publication information

Sci Rep. 2025 Jan 2;15(1):155. doi: 10.1038/s41598-024-83725-5.

Abstract

To address the weak single-domain generalization of existing crowd counting methods, this study proposes a new crowd counting framework called Multi-scale Attention and Hierarchy level Enhancement (MAHE). First, by fusing channel attention and spatial attention, the model attends to both detailed features and the macro-level information carried by changes in structural position. Second, a multi-head attention feature module helps the model capture complex dependencies between sequence elements. In addition, a three-stage encoding and decoding scheme enables the model to represent crowd density information effectively. Finally, multi-scale hierarchy level feature fusion further strengthens the fusion of multi-scale features derived from different receptive fields, so that the model learns both high-level semantic information and low-level multi-scale visual field feature information. This design enhances the model's capacity to capture key feature information even in highly differentiated datasets, thereby improving its single-domain generalization ability. Extensive experiments on different datasets demonstrate the model's strong generalization capability. The study not only improves the accuracy of crowd counting but also introduces a new research approach for single-domain generalization in crowd counting.
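
The abstract does not give implementation details for the fusion of channel attention and spatial attention. As a rough illustration only, the sketch below assumes a CBAM-style sequential arrangement (channel weights first, then a spatial map) written in NumPy. The function names, the reduction ratio `r`, the random weights, and the simplified two-weight spatial mixing (in place of a learned convolution) are all assumptions for illustration and are not taken from the MAHE paper.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def channel_attention(x, w1, w2):
    """x: (C, H, W). Returns per-channel weights in (0, 1), shape (C,)."""
    avg = x.mean(axis=(1, 2))          # global average pooling per channel
    mx = x.max(axis=(1, 2))            # global max pooling per channel
    # Shared two-layer MLP with ReLU, applied to both pooled vectors.
    mlp = lambda v: w2 @ np.maximum(w1 @ v, 0.0)
    return sigmoid(mlp(avg) + mlp(mx))

def spatial_attention(x, mix):
    """x: (C, H, W). Returns a spatial map in (0, 1), shape (H, W).

    Assumption: a 2-weight mix of the channel-wise mean and max maps
    stands in for the learned convolution a real module would use.
    """
    avg = x.mean(axis=0)
    mx = x.max(axis=0)
    return sigmoid(mix[0] * avg + mix[1] * mx)

def fused_attention(x, w1, w2, mix):
    """Reweight channels first, then reweight spatial positions."""
    ca = channel_attention(x, w1, w2)
    x_c = x * ca[:, None, None]        # broadcast (C,) over (C, H, W)
    sa = spatial_attention(x_c, mix)
    return x_c * sa[None, :, :]        # broadcast (H, W) over (C, H, W)

# Toy feature map and hypothetical (random) weights.
rng = np.random.default_rng(0)
C, H, W, r = 8, 4, 4, 2
x = rng.standard_normal((C, H, W))
w1 = rng.standard_normal((C // r, C)) * 0.1
w2 = rng.standard_normal((C, C // r)) * 0.1
mix = np.array([0.5, 0.5])

out = fused_attention(x, w1, w2, mix)
print(out.shape)
```

Because both attention maps pass through a sigmoid, every output value is the input value scaled by a factor between 0 and 1: the channel weights decide which feature channels matter, and the spatial map decides where in the image they matter.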

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/c130/11696471/5befa2941ba2/41598_2024_83725_Fig1_HTML.jpg
