Multi-task learning with cross-task consistency for improved depth estimation in colonoscopy.

Affiliations

School of Computer Science, Faculty of Engineering and Physical Sciences, University of Leeds, Leeds, LS2 9JT, United Kingdom.

Department of Gastroenterology, Leeds Teaching Hospitals NHS Trust, Leeds, UK; Division of Gastroenterology and Surgical Sciences, Leeds Institute of Medical Research at St James's, University of Leeds, Leeds, UK.

Publication information

Med Image Anal. 2025 Jan;99:103379. doi: 10.1016/j.media.2024.103379. Epub 2024 Nov 4.

DOI:10.1016/j.media.2024.103379

PMID:39536401

Abstract

Colonoscopy screening is the gold standard procedure for assessing abnormalities in the colon and rectum, such as ulcers and cancerous polyps. Measuring the abnormal mucosal area and its 3D reconstruction can help quantify the surveyed area and objectively evaluate disease burden. However, due to the complex topology of these organs and variable physical conditions (for example, lighting, large homogeneous textures, and image modality), estimating distance from the camera (i.e., depth) is highly challenging. Moreover, most colonoscopic video acquisition is monocular, making depth estimation a non-trivial problem. While depth estimation methods in computer vision have been proposed and advanced on natural scene datasets, the efficacy of these techniques has not been widely quantified on colonoscopy datasets. As the colonic mucosa has several low-texture regions that are not well pronounced, learning representations from an auxiliary task can improve salient feature extraction, allowing accurate estimation of camera depth. In this work, we propose a novel multi-task learning (MTL) approach with a shared encoder and two decoders, namely a surface normal decoder and a depth estimation decoder. Our depth estimator incorporates attention mechanisms to enhance global context awareness. We leverage surface normal prediction to improve geometric feature extraction. We also apply a cross-task consistency loss between the two geometrically related tasks, surface normal and camera depth. We demonstrate a 15.75% improvement in relative error and a 10.7% improvement in δ accuracy over the most accurate state-of-the-art baseline, the Big-to-Small (BTS) approach. All experiments are conducted on the recently released C3VD dataset, and thus we provide a first benchmark of state-of-the-art methods on this dataset.
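
As a rough illustration of the cross-task consistency idea described in the abstract, the sketch below derives surface normals from a predicted depth map and penalizes disagreement with separately predicted normals. It is not the authors' implementation: the abstract does not give the loss formula, so the cosine-distance form, the finite-difference normal approximation (which ignores camera intrinsics), and the function names `normals_from_depth` and `cross_task_consistency_loss` are all assumptions made for illustration.

```python
import math


def normals_from_depth(depth):
    """Approximate unit surface normals from a depth map (list of rows).

    Assumption: normals are taken proportional to (-dz/dx, -dz/dy, 1) in
    pixel units, using finite differences and no camera intrinsics.
    """
    h, w = len(depth), len(depth[0])
    normals = []
    for y in range(h):
        row = []
        for x in range(w):
            # Clamp neighbours at the border (one-sided difference there).
            x0, x1 = max(x - 1, 0), min(x + 1, w - 1)
            y0, y1 = max(y - 1, 0), min(y + 1, h - 1)
            dzdx = (depth[y][x1] - depth[y][x0]) / max(x1 - x0, 1)
            dzdy = (depth[y1][x] - depth[y0][x]) / max(y1 - y0, 1)
            norm = math.sqrt(dzdx * dzdx + dzdy * dzdy + 1.0)
            row.append((-dzdx / norm, -dzdy / norm, 1.0 / norm))
        normals.append(row)
    return normals


def cross_task_consistency_loss(pred_depth, pred_normals):
    """Mean (1 - cosine similarity) between depth-derived and predicted normals.

    Hypothetical loss form chosen for illustration only.
    """
    derived = normals_from_depth(pred_depth)
    total, count = 0.0, 0
    for derived_row, pred_row in zip(derived, pred_normals):
        for d, p in zip(derived_row, pred_row):
            # Normalise the predicted normal; guard against a zero vector.
            p_len = math.sqrt(sum(c * c for c in p)) or 1.0
            cos = sum(a * b for a, b in zip(d, p)) / p_len
            total += 1.0 - cos
            count += 1
    return total / count


# A flat surface facing the camera agrees with normals pointing along +z.
flat_depth = [[2.0] * 4 for _ in range(3)]
facing_camera = [[(0.0, 0.0, 1.0)] * 4 for _ in range(3)]
print(cross_task_consistency_loss(flat_depth, facing_camera))
```

In a real training setup this term would be written with a differentiable tensor library and added to the per-task depth and surface normal losses; the pure-Python form here is only meant to make the geometry explicit.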

Similar articles

Multi-task learning with cross-task consistency for improved depth estimation in colonoscopy.

Med Image Anal. 2025 Jan;99:103379. doi: 10.1016/j.media.2024.103379. Epub 2024 Nov 4.

Unsupervised colonoscopic depth estimation by domain translations with a Lambertian-reflection keeping auxiliary task.

Int J Comput Assist Radiol Surg. 2021 Jun;16(6):989-1001. doi: 10.1007/s11548-021-02398-x. Epub 2021 May 17.

Colonoscopy 3D video dataset with paired depth from 2D-3D registration.

Med Image Anal. 2023 Dec;90:102956. doi: 10.1016/j.media.2023.102956. Epub 2023 Sep 7.

SLAM-based dense surface reconstruction in monocular Minimally Invasive Surgery and its application to Augmented Reality.

Comput Methods Programs Biomed. 2018 May;158:135-146. doi: 10.1016/j.cmpb.2018.02.006. Epub 2018 Feb 8.

How big is this neoplasia? live colonoscopic size measurement using the Infocus-Breakpoint.

Med Image Anal. 2015 Jan;19(1):58-74. doi: 10.1016/j.media.2014.09.002. Epub 2014 Sep 16.

Deep learning and conditional random fields-based depth estimation and topographical reconstruction from conventional endoscopy.

Med Image Anal. 2018 Aug;48:230-243. doi: 10.1016/j.media.2018.06.005. Epub 2018 Jun 14.

GTIGNet: Global Topology Interaction Graphormer Network for 3D hand pose estimation.

Neural Netw. 2025 May;185:107221. doi: 10.1016/j.neunet.2025.107221. Epub 2025 Feb 4.

SimCol3D - 3D reconstruction during colonoscopy challenge.

Med Image Anal. 2024 Aug;96:103195. doi: 10.1016/j.media.2024.103195. Epub 2024 May 15.

StaSiS-Net: A stacked and siamese disparity estimation network for depth reconstruction in modern 3D laparoscopy.

Med Image Anal. 2022 Apr;77:102380. doi: 10.1016/j.media.2022.102380. Epub 2022 Jan 30.

Leveraging a realistic synthetic database to learn Shape-from-Shading for estimating the colon depth in colonoscopy images.

Comput Med Imaging Graph. 2024 Jul;115:102390. doi: 10.1016/j.compmedimag.2024.102390. Epub 2024 May 3.
