BME Dept, National University of Singapore (NUS).
BME Dept, National University of Singapore (NUS); Department of Instrumentation and Control Engineering, NIT Trichy, India.
Med Image Anal. 2021 Jan;67:101837. doi: 10.1016/j.media.2020.101837. Epub 2020 Oct 15.
Representation learning of task-oriented attention while tracking instruments holds vast potential in image-guided robotic surgery. Incorporating cognitive ability to automate camera control lets the surgeon concentrate more on handling the surgical instruments. The objective is to reduce operation time and facilitate the surgery for both surgeons and patients. We propose an end-to-end trainable Spatio-Temporal Multi-Task Learning (ST-MTL) model with a shared encoder and spatio-temporal decoders for real-time surgical instrument segmentation and task-oriented saliency detection. In a shared-parameter MTL model, optimizing multiple loss functions to a common convergence point is still an open challenge. We tackle the problem with a novel asynchronous spatio-temporal optimization (ASTO) technique that calculates independent gradients for each decoder. We also design a competitive squeeze-and-excitation unit with a skip connection that retains weak features, excites strong features, and performs dynamic spatial and channel-wise feature recalibration. To better capture long-term spatio-temporal dependencies, we enhance the long short-term memory (LSTM) module by concatenating high-level encoder features of consecutive frames. We also introduce a Sinkhorn regularized loss to enhance task-oriented saliency detection while preserving computational efficiency. We generate task-aware saliency maps and instrument scanpaths on the dataset of the MICCAI 2017 robotic instrument segmentation challenge. Compared with state-of-the-art segmentation and saliency methods, our model performs better on most of the evaluation metrics and produces an outstanding performance in the challenge.
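The abstract names a Sinkhorn regularized loss but does not give its form. As a rough illustration of the general idea only, the sketch below runs generic Sinkhorn iterations for entropy-regularized optimal transport between two small normalized histograms, which could stand in for flattened saliency maps. The function name `sinkhorn_cost`, the squared-distance cost matrix, and the values of `eps` and `n_iters` are assumptions made for this example, not details taken from the paper.

```python
import math


def sinkhorn_cost(a, b, cost, eps=1.0, n_iters=200):
    """Entropy-regularized transport cost between histograms a and b.

    Generic Sinkhorn iteration (illustrative assumption, not the paper's loss).
    a, b: lists of positive weights that each sum to 1.
    cost: len(a) x len(b) matrix of ground costs.
    """
    n, m = len(a), len(b)
    # Gibbs kernel K[i][j] = exp(-cost[i][j] / eps)
    K = [[math.exp(-cost[i][j] / eps) for j in range(m)] for i in range(n)]
    u = [1.0] * n
    v = [1.0] * m
    for _ in range(n_iters):
        # Rescale rows to match a, then columns to match b
        u = [a[i] / sum(K[i][j] * v[j] for j in range(m)) for i in range(n)]
        v = [b[j] / sum(K[i][j] * u[i] for i in range(n)) for j in range(m)]
    # Transport plan P = diag(u) K diag(v)
    plan = [[u[i] * K[i][j] * v[j] for j in range(m)] for i in range(n)]
    total = sum(plan[i][j] * cost[i][j] for i in range(n) for j in range(m))
    return total, plan


# Hypothetical 1-D "saliency" histograms over three positions
pred = [0.7, 0.2, 0.1]
target = [0.1, 0.2, 0.7]
# Squared distance between positions as the ground cost (an assumption)
cost = [[float((i - j) ** 2) for j in range(3)] for i in range(3)]

cost_shift, plan_shift = sinkhorn_cost(pred, target, cost)
cost_same, plan_same = sinkhorn_cost(pred, pred, cost)
```

In this toy setting the cost is lower when the predicted histogram matches the target than when its mass sits at the wrong position, which is the property that makes a transport-based term usable as a saliency loss. Each iteration involves only matrix-vector products, which is consistent with the abstract's emphasis on computational efficiency.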