Yu Xiaowei, Zhang Lu, Dai Haixing, Lyu Yanjun, Zhao Lin, Wu Zihao, Liu David, Liu Tianming, Zhu Dajiang
ArXiv. 2023 Mar 27:arXiv:2303.15569v1.
Designing more efficient, reliable, and explainable neural network architectures is critical for studies based on artificial intelligence (AI) techniques. Previous studies have found, through post-hoc analysis, that the best-performing ANNs surprisingly resemble biological neural networks (BNNs), which suggests that ANNs and BNNs may share common principles for achieving optimal performance in machine learning or cognitive/behavioral tasks. Inspired by this phenomenon, we proactively instill organizational principles of BNNs to guide the redesign of ANNs. Specifically, we leverage the Core-Periphery (CP) organization, which is widely found in human brain networks, to guide the information communication mechanism in the self-attention of the vision transformer (ViT), and we name this novel framework CP-ViT. In CP-ViT, the attention operation between nodes is defined by a sparse graph with a core-periphery structure (CP graph), in which the core nodes are redesigned and reorganized to play an integrative role, serving as a hub through which the periphery nodes exchange information. We evaluated the proposed CP-ViT on multiple public datasets, including a medical image dataset (INbreast) and natural image datasets. Interestingly, by incorporating this BNN-derived principle (the CP structure) into the redesign of ViT, CP-ViT outperforms other state-of-the-art ANNs. In general, our work advances the state of the art in three aspects: 1) it provides novel insights for brain-inspired AI, showing that principles found in BNNs can guide and improve ANN architecture design; 2) it shows that there exist sweet spots of CP graphs that lead to CP-ViTs with significantly improved performance; and 3) the core nodes in CP-ViT correspond to task-related, meaningful, and important image patches, which can significantly enhance the interpretability of the trained deep model.
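The attention mechanism described above, in which each patch (node) may only attend along the edges of a sparse core-periphery graph, can be illustrated with a minimal sketch. This is not the paper's implementation: the function names, the NumPy single-head formulation, and the choice of making the first `n_core` nodes the core are all assumptions made for illustration.

```python
import numpy as np

def core_periphery_mask(n_nodes, n_core):
    """Binary adjacency mask for a core-periphery (CP) graph.

    Core nodes connect to every node; periphery nodes connect only to
    core nodes and to themselves, so the cores act as integrative hubs
    through which periphery nodes exchange information.
    """
    mask = np.zeros((n_nodes, n_nodes), dtype=bool)
    core = np.arange(n_core)          # assumption: first n_core nodes are core
    mask[core, :] = True              # core rows attend to all nodes
    mask[:, core] = True              # every node attends to core nodes
    np.fill_diagonal(mask, True)      # keep self-attention for all nodes
    return mask

def masked_attention(q, k, v, mask):
    """Scaled dot-product attention restricted to the CP graph's edges."""
    d = q.shape[-1]
    scores = q @ k.T / np.sqrt(d)
    scores = np.where(mask, scores, -np.inf)   # block non-edges before softmax
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ v
```

Replacing the dense all-to-all attention with this masked variant is what makes the communication pattern core-periphery: periphery-to-periphery pairs receive zero attention weight, so all cross-patch information must route through the core nodes.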