Li Shiyan, Kong Qingqun, Gao Xuan, Shi Fangzhen, Li Lianghui, Zhang Qi, Wang Penghao, Yang Kehu
School of Artificial Intelligence, China University of Mining and Technology-Beijing, Beijing, 100083, China.
School of Energy and Mining Engineering, China University of Mining and Technology-Beijing, Beijing, 100083, China.
Sci Data. 2025 Jul 8;12(1):1160. doi: 10.1038/s41597-025-05493-9.
Visual perception is one of the core technologies for achieving unmanned, intelligent mining in underground mines. However, the harsh environment unique to underground mines poses significant challenges to visible-light-based visual perception methods. Multimodal fusion semantic segmentation offers a promising solution, but the lack of dedicated multimodal datasets for underground mines severely limits its application in this field. To address this gap, this work develops MUSeg, a multimodal semantic segmentation benchmark dataset for complex underground mine scenes. The dataset comprises 3,171 aligned RGB-depth image pairs collected from six typical mines across different regions of China. Guided by the requirements of mine perception tasks, we manually annotated 15 categories of semantic objects, with all labels verified by mining experts. We also evaluated the dataset using classical multimodal semantic segmentation algorithms. MUSeg not only fills the gap in this field but also provides a critical foundation for research and application of multimodal perception algorithms in mining, contributing significantly to the advancement of intelligent mining.
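As a rough illustration of how an aligned RGB-depth pair like those in MUSeg might be prepared for a multimodal segmentation model, the sketch below stacks the two modalities into a single 4-channel input (a common early-fusion layout). The image resolution, depth bit-depth, and normalization here are assumptions for illustration, not details taken from the dataset documentation; real pairs would be loaded from disk rather than generated.

```python
import numpy as np

# Assumed shapes and value ranges -- not specified in the abstract.
H, W, NUM_CLASSES = 480, 640, 15

# Synthetic stand-ins for one aligned RGB/depth pair.
rgb = np.random.randint(0, 256, (H, W, 3), dtype=np.uint8)      # 8-bit color
depth = np.random.randint(0, 65536, (H, W), dtype=np.uint16)    # 16-bit depth

def to_fusion_input(rgb: np.ndarray, depth: np.ndarray) -> np.ndarray:
    """Normalize an aligned RGB-depth pair to [0, 1] and stack it into
    a single (H, W, 4) array for an early-fusion segmentation network."""
    rgb_f = rgb.astype(np.float32) / 255.0
    depth_f = depth.astype(np.float32) / 65535.0
    return np.dstack([rgb_f, depth_f[..., None]])

x = to_fusion_input(rgb, depth)
print(x.shape)  # (480, 640, 4)
```

Mid- or late-fusion architectures would instead keep the two modalities in separate encoder branches; the stacked layout above is only the simplest way to feed both streams to one network.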