Zhang Xuedong, Sun Wenlei, Chen Ke, Song Shijie
School of Intelligent Manufacturing Modern Industry, Xinjiang University, Urumqi, 830046, Xinjiang, China.
TBEA Co., Ltd., Changji, 831100, Xinjiang, China.
Sci Rep. 2025 Jan 2;15(1):98. doi: 10.1038/s41598-024-83561-7.
To achieve real-time monitoring and intelligent maintenance of transformers, a framework based on deep vision and digital twin technology has been developed. An enhanced visual detection model, DETR + X, is proposed, with multidimensional sample data augmentation implemented through Swin2SR and GAN networks. The model converts one-dimensional dissolved gas analysis (DGA) data into three-dimensional feature images using Gramian angular fields, which allows heterogeneous modal information to be transformed and fused. The Pyramid Vision Transformer (PVT) replaces the traditional ResNet structure as the backbone for image feature extraction, and a Deformable Attention mechanism handles the complex spatial structure of multi-scale features. Test results indicate that the improved DETR + X model performs well in transformer state recognition, achieving a classification accuracy of 100% on DGA feature maps. In object detection, it surpasses advanced models such as Faster R-CNN, RetinaNet, YOLOv8, and Deformable DETR in overall mAP50, with particularly significant gains in small object detection. Furthermore, the Llava-7b model, fine-tuned on domain expertise, serves as an expert decision-making tool for transformer maintenance, providing accurate operational recommendations based on the visual detection results. Finally, a comprehensive platform built on the digital twin and the inference models has been developed to deliver this real-time monitoring and intelligent maintenance.
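The Gramian angular field step mentioned in the abstract can be illustrated with a minimal sketch. It shows the standard summation (GASF) and difference (GADF) fields computed from a short one-dimensional series. The function name, the min-max rescaling to [-1, 1], and the sample gas concentrations are assumptions made for illustration. The abstract does not state how the authors build the three image channels, so this is not their implementation.

```python
import numpy as np


def gramian_angular_fields(series):
    """Return (GASF, GADF) matrices for a 1-D series.

    Hypothetical helper for illustration; not taken from the paper.
    """
    x = np.asarray(series, dtype=float)
    lo, hi = x.min(), x.max()
    if hi == lo:
        # A constant series carries no contrast; map it to zeros.
        scaled = np.zeros_like(x)
    else:
        # Min-max rescale into [-1, 1] so that arccos is defined.
        scaled = 2.0 * (x - lo) / (hi - lo) - 1.0
    scaled = np.clip(scaled, -1.0, 1.0)  # guard against rounding error

    phi = np.arccos(scaled)  # polar-angle encoding of each value
    gasf = np.cos(phi[:, None] + phi[None, :])  # summation field
    gadf = np.sin(phi[:, None] - phi[None, :])  # difference field
    return gasf, gadf


# Assumed example values: five dissolved-gas concentrations in ppm.
sample = [35.0, 12.0, 4.0, 8.0, 0.5]
gasf, gadf = gramian_angular_fields(sample)
print(gasf.shape, gadf.shape)
```

Each field is a square matrix whose side equals the series length, so a five-gas sample yields 5 x 5 images. One plausible way to obtain a multi-channel image is to stack such fields along a channel axis, but that choice is an assumption here.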