School of Traffic and Transportation, Beijing Jiaotong University, Beijing 100044, China.
State Key Laboratory of Rail Traffic Control and Safety, Beijing Jiaotong University, Beijing 100044, China.
Sensors (Basel). 2022 Aug 25;22(17):6419. doi: 10.3390/s22176419.
Monitoring the state of the railway track line is one of the key tasks for ensuring the safety of the railway transportation system, and the defect recognition result, that is, the inspection report, is the main basis for maintenance decisions. Most previous work has proposed intelligent detection methods for rapid and accurate inspection of the safety state of the railway track line. However, few studies have investigated the automatic generation of inspection reports. Inspired by recent advances in dense captioning, such techniques can be used to generate textual information on the type, position, status, and interrelationship of the key components from field images. To this end, building on DenseCap, a railway track line image captioning model (RTLCap for short) is proposed, which replaces VGG16 with ResNet-50-FPN as the backbone to extract more powerful image features. In addition, to address object occlusion and category imbalance in the field images, Soft-NMS and Focal Loss are applied in RTLCap to improve defect description performance. Then, to increase the image processing speed of RTLCap and reduce model complexity, a reconstructed RTLCap model named Faster RTLCap is presented with the help of YOLOv3. In the encoder, a multi-level regional feature localization, mapping, and fusion module (MFLMF) is proposed to extract regional features, and an SPP (Spatial Pyramid Pooling) layer is employed after MFLMF to reduce model parameters. In the decoder, a stacked LSTM is adopted as the language model for better language representation learning. Both quantitative and qualitative experimental results demonstrate the effectiveness of the proposed methods.
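For readers unfamiliar with the two techniques named for occlusion and category imbalance, the following is a minimal, illustrative sketch of Focal Loss and Soft-NMS in their commonly published forms. It is not the authors' implementation: the abstract does not state which Soft-NMS variant or which hyperparameters RTLCap uses, so the Gaussian decay, the values `alpha=0.25`, `gamma=2.0`, `sigma=0.5`, `score_thresh=0.001`, and all function names here are assumptions chosen for illustration.

```python
import math

def focal_loss(p, y, alpha=0.25, gamma=2.0):
    """Binary focal loss for one prediction (illustrative defaults).

    p is the predicted probability of the positive class, y is 0 or 1.
    The (1 - p_t) ** gamma factor down-weights easy, well-classified
    examples so that rare or hard categories contribute more to the loss.
    """
    p_t = p if y == 1 else 1.0 - p
    alpha_t = alpha if y == 1 else 1.0 - alpha
    p_t = min(max(p_t, 1e-12), 1.0)  # guard against log(0)
    return -alpha_t * (1.0 - p_t) ** gamma * math.log(p_t)

def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    iw = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    ih = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = iw * ih
    union = ((a[2] - a[0]) * (a[3] - a[1])
             + (b[2] - b[0]) * (b[3] - b[1]) - inter)
    return inter / union if union > 0 else 0.0

def soft_nms(boxes, scores, sigma=0.5, score_thresh=0.001):
    """Gaussian Soft-NMS (assumed variant).

    Instead of discarding boxes that overlap the current best box,
    their scores are decayed by exp(-IoU^2 / sigma), so occluded but
    valid objects can survive with a reduced score.
    Returns a list of (box_index, rescored_value) in selection order.
    """
    remaining = [(s, i) for i, s in enumerate(scores)]
    kept = []
    while remaining:
        best = max(remaining)  # highest current score
        remaining.remove(best)
        s_best, i_best = best
        kept.append((i_best, s_best))
        decayed = []
        for s, i in remaining:
            s *= math.exp(-(iou(boxes[i_best], boxes[i]) ** 2) / sigma)
            if s >= score_thresh:  # drop boxes whose score has collapsed
                decayed.append((s, i))
        remaining = decayed
    return kept

if __name__ == "__main__":
    boxes = [(0, 0, 10, 10), (1, 1, 11, 11), (20, 20, 30, 30)]
    scores = [0.9, 0.8, 0.7]
    for idx, score in soft_nms(boxes, scores):
        print(idx, round(score, 3))
```

In this toy input the second box overlaps the first heavily; a hard NMS with a typical IoU threshold of 0.5 would discard it, whereas the sketch keeps it with a lowered score and ranks it after the non-overlapping third box.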