Li Chenglong, Xue Wanlin, Jia Yaqing, Qu Zhichen, Luo Bin, Tang Jin, Sun Dengdi
IEEE Trans Image Process. 2022;31:392-404. doi: 10.1109/TIP.2021.3130533. Epub 2021 Dec 9.
RGBT tracking has received a surge of interest in the computer vision community, but this research field lacks a large-scale and high-diversity benchmark dataset, which is essential for both the training of deep RGBT trackers and the comprehensive evaluation of RGBT tracking methods. To this end, we present a Large-scale High-diversity benchmark for short-term RGBT tracking (LasHeR) in this work. LasHeR consists of 1224 visible and thermal infrared video pairs with more than 730K frame pairs in total. Each frame pair is spatially aligned and manually annotated with a bounding box, making the dataset densely and accurately annotated. LasHeR is highly diverse, captured across a broad range of object categories, camera viewpoints, scene complexities, and environmental factors spanning seasons, weather conditions, day and night. We conduct a comprehensive performance evaluation of 12 RGBT tracking algorithms on the LasHeR dataset and present a detailed analysis. In addition, we release the unaligned version of LasHeR to attract research interest in alignment-free RGBT tracking, which is a more practical task in real-world applications. The datasets and evaluation protocols are available at: https://github.com/mmic-lcl/Datasets-and-benchmark-code.