Zhang Taohong, Fan Suli, Hu Junnan, Guo Xuxu, Li Qianqian, Zhang Ying, Wulamu Aziguli
Department of Computer, School of Computer and Communication Engineering, University of Science and Technology Beijing (USTB), Beijing 100083, China.
Beijing Key Laboratory of Knowledge Engineering for Materials Science, Beijing 100083, China.
Comput Intell Neurosci. 2021 Apr 14;2021:6647220. doi: 10.1155/2021/6647220. eCollection 2021.
In this paper, a feature fusion method with guided training (FGT-Net) is constructed to fuse image data and numerical data for recognition tasks that cannot be classified accurately from images alone. The proposed structure is divided into a shared-weight network part, a feature fusion layer part, and a classification layer part. First, a guided training method is proposed to optimize the training process: representative images and training images are fed into the shared-weight network so that it learns to extract image features more effectively. The image features and numerical features are then fused in the feature fusion layer and passed to the classification layer for the classification task. The loss is calculated from the outputs of both the shared-weight network and the classification layer. Experiments are carried out to verify the effectiveness of the proposed model. The results show that FGT-Net achieves an accuracy of 87.8%, which is 15% higher than the ShuffleNetv2 CNN model (which processes image data only) and 9.8% higher than the DNN method (which processes structured data only).
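The fusion pipeline described above (image features from a shared-weight network, concatenated with numerical features, then classified) can be sketched minimally in numpy. All dimensions, weights, and the linear "feature extractor" stand-in below are hypothetical illustrations, not the authors' implementation:

```python
import numpy as np

rng = np.random.default_rng(0)

def extract_image_features(images, w):
    # Stand-in for the shared-weight CNN: a single linear map + ReLU.
    return np.maximum(images @ w, 0.0)

# Hypothetical dimensions (not taken from the paper).
n, img_dim, num_dim, feat_dim, n_classes = 4, 64, 8, 16, 3

w_img = rng.normal(size=(img_dim, feat_dim)) * 0.1
w_cls = rng.normal(size=(feat_dim + num_dim, n_classes)) * 0.1

images = rng.normal(size=(n, img_dim))   # flattened image batch
numeric = rng.normal(size=(n, num_dim))  # structured/numerical data

img_feat = extract_image_features(images, w_img)
fused = np.concatenate([img_feat, numeric], axis=1)  # feature fusion layer
logits = fused @ w_cls                               # classification layer

# Softmax over classes.
probs = np.exp(logits - logits.max(axis=1, keepdims=True))
probs /= probs.sum(axis=1, keepdims=True)

# Per the abstract, FGT-Net's loss also includes a term on the
# shared-weight network's output (guided training), e.g. conceptually:
#   total_loss = classification_loss + guidance_loss
print(probs.shape)  # (4, 3)
```

In a real implementation the linear map would be replaced by a CNN backbone whose weights are shared between the representative-image and training-image branches.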