Institute of Technical Medicine (ITeM), Furtwangen University, 78054 Villingen-Schwenningen, Germany.
Innovation Center Computer Assisted Surgery (ICCAS), University of Leipzig, 04103 Leipzig, Germany.
Sensors (Basel). 2023 Feb 9;23(4):1958. doi: 10.3390/s23041958.
Adapting intelligent context-aware systems (CAS) to future operating rooms (OR) aims to improve situational awareness and provide surgical decision support to medical teams. A CAS analyzes data streams from available devices during surgery and communicates real-time knowledge to clinicians. Recent advances in computer vision and machine learning, particularly deep learning, have paved the way for extensive research on CAS. In this work, a deep learning approach was proposed for surgical phase recognition, tool classification, and weakly-supervised tool localization in laparoscopic videos. The ResNet-50 convolutional neural network (CNN) architecture was adapted by adding attention modules and fusing features from multiple stages to generate better-focused, generalized, and well-representative features. A multi-map convolutional layer, followed by tool-wise and spatial pooling operations, was then utilized to localize tools and generate tool presence confidences. Finally, a long short-term memory (LSTM) network was employed to model temporal information and perform tool classification and phase recognition. The proposed approach was evaluated on the Cholec80 dataset. The experimental results (88.5% mean precision and 89.0% mean recall for phase recognition, 95.6% mean average precision for tool presence detection, and a 70.1% F1-score for tool localization) demonstrated the model's ability to learn discriminative features for all three tasks and revealed the importance of integrating attention modules and multi-stage feature fusion for more robust and precise detection of surgical phases and tools.
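The weakly-supervised localization step described above can be sketched as follows. This is a minimal NumPy illustration, not the authors' implementation: it assumes a 1x1-convolution head producing M maps per tool (here a hypothetical M = 4 for the 7 Cholec80 tool classes), tool-wise pooling by averaging the M maps into one localization map per tool, and spatial max pooling to obtain a presence confidence per tool. The backbone features, weight shapes, and pooling choices are illustrative assumptions.

```python
import numpy as np

def multimap_head(features, weights, num_tools=7, maps_per_tool=4):
    """Toy multi-map head (illustrative, not the paper's exact layer):
    a 1x1 convolution producing M maps per tool, then tool-wise pooling
    (mean over the M maps) and spatial pooling (max over locations)."""
    C, H, W = features.shape  # (channels, height, width) backbone features
    # A 1x1 convolution is a per-pixel linear map: weights has shape (T*M, C)
    maps = np.einsum('kc,chw->khw', weights, features)       # (T*M, H, W)
    maps = maps.reshape(num_tools, maps_per_tool, H, W)
    # Tool-wise pooling: one localization map per tool class
    loc_maps = maps.mean(axis=1)                             # (T, H, W)
    # Spatial pooling: max activation acts as a presence logit per tool
    logits = loc_maps.reshape(num_tools, -1).max(axis=1)     # (T,)
    confidences = 1.0 / (1.0 + np.exp(-logits))              # sigmoid
    return loc_maps, confidences

# Example with random features standing in for the CNN output
rng = np.random.default_rng(0)
feats = rng.standard_normal((64, 7, 12))      # assumed feature volume
w = rng.standard_normal((7 * 4, 64)) * 0.05   # assumed 1x1-conv weights
loc, conf = multimap_head(feats, w)
```

Because only the pooled confidences are supervised (by tool presence labels), the per-tool maps `loc` learn localization as a by-product, which is the essence of the weakly-supervised setup.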