Wang Shulan, Liu Siyu, Jin Mengting, Fan Pingmei
School of Architecture and Art Design, Hebei University of Technology, Tianjin, China.
School of Information and Artificial Intelligence, Anhui Business College, Anhui, China.
PLoS One. 2025 Aug 4;20(8):e0328507. doi: 10.1371/journal.pone.0328507. eCollection 2025.
Mural image recognition plays a critical role in the digital preservation of cultural heritage. However, it must generalize across cultures and across the styles of multiple periods, a challenge compounded by limited sample sizes and intricate details such as complex artistic patterns and surface losses caused by natural weathering. This paper proposes a deep learning model based on DenseNet201-FPN that incorporates a Bidirectional Convolutional Block Attention Module (Bi-CBAM), a dynamic focal distillation loss, and convex regularization. First, a lightweight Feature Pyramid Network (FPN) is embedded into DenseNet201 to fuse multi-scale texture features (28 × 28 × 256, 14 × 14 × 512, 7 × 7 × 1024). Second, a bidirectional LSTM-driven attention module iteratively optimizes channel and spatial weights, enhancing detail perception for low-frequency categories. Third, a dynamic temperature distillation strategy (T = 3 → 1) balances supervision from teacher models (ResNeXt101) and the ground truth, improving the F1-score of rare classes by 6.1%. Experimental results on a self-constructed mural dataset (2,000 images, 26 subcategories) demonstrate 87.9% accuracy (+3.7% over DenseNet201) and real-time inference on edge devices (63 ms/frame at 8.1 W on a Jetson TX2). This study provides a cost-effective solution for large-scale mural digitization in resource-constrained environments.
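The dynamic temperature distillation strategy (T = 3 → 1) can be illustrated with a minimal sketch. The abstract does not give the schedule or the loss weighting, so everything below is an assumption made for illustration: the linear annealing in `temperature_at`, the mixing weight `alpha`, the standard T² scaling of the soft-target term, and all function names. The focal weighting mentioned in the abstract is not modeled here.

```python
import math


def log_softmax(logits, temperature=1.0):
    """Numerically stable log-softmax of logits divided by a temperature."""
    scaled = [z / temperature for z in logits]
    m = max(scaled)
    lse = m + math.log(sum(math.exp(s - m) for s in scaled))
    return [s - lse for s in scaled]


def temperature_at(epoch, total_epochs, t_start=3.0, t_end=1.0):
    """Anneal the temperature from t_start to t_end.

    ASSUMPTION: linear annealing over epochs; the source only states T = 3 -> 1.
    """
    if total_epochs <= 1:
        return t_end
    frac = min(max(epoch / (total_epochs - 1), 0.0), 1.0)
    return t_start + (t_end - t_start) * frac


def distillation_loss(student_logits, teacher_logits, label, temperature, alpha=0.5):
    """Blend a soft-target KL term (teacher) with hard-label cross-entropy.

    ASSUMPTION: alpha and the T^2 scaling follow the common knowledge
    distillation formulation, not values reported in the source.
    """
    log_p_teacher = log_softmax(teacher_logits, temperature)
    log_p_student = log_softmax(student_logits, temperature)
    # KL(teacher || student) on temperature-softened distributions.
    kl = sum(
        math.exp(lt) * (lt - ls)
        for lt, ls in zip(log_p_teacher, log_p_student)
    )
    # Cross-entropy against the ground-truth label at temperature 1.
    ce = -log_softmax(student_logits)[label]
    return alpha * (temperature ** 2) * kl + (1.0 - alpha) * ce


if __name__ == "__main__":
    teacher = [2.0, 0.5, -1.0]   # hypothetical teacher logits
    student = [1.2, 0.8, -0.5]   # hypothetical student logits
    for epoch in (0, 5, 9):
        t = temperature_at(epoch, total_epochs=10)
        loss = distillation_loss(student, teacher, label=0, temperature=t)
        print(f"epoch={epoch} T={t:.2f} loss={loss:.4f}")
```

Under these assumptions, a high early temperature softens the teacher distribution so the student also sees relative similarities between classes, while the later temperature of 1 shifts the emphasis toward the teacher's sharper predictions and the ground-truth labels.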