Department of Computer Science, COMSATS University, Islamabad, Pakistan.
Department of Information Systems and Technology, Mid Sweden University, Sundsvall, Sweden.
J Digit Imaging. 2019 Dec;32(6):1027-1043. doi: 10.1007/s10278-019-00206-2.
Surgical telementoring systems have attracted considerable interest, especially for remote locations. However, limited bandwidth has been the primary bottleneck for efficient telementoring. This study aims to establish an efficient surgical telementoring system in which a qualified surgeon (mentor) provides real-time guidance and technical assistance for surgical procedures to the on-site physician (surgeon). High Efficiency Video Coding (HEVC/H.265)-based video compression has shown promising results for telementoring applications. However, there is a trade-off between the bandwidth required for video transmission and the quality of the video received by the remote surgeon. To compress and transmit real-time surgical videos efficiently, a hybrid lossless-lossy approach is proposed in which the surgical incision region is coded at high quality, whereas the background region is coded at lower quality according to its distance from the surgical incision region. State-of-the-art deep learning (DL) architectures for semantic segmentation can be used to extract the surgical incision region. However, the computational complexity of these architectures is high, resulting in long training and inference times. Because encoding time is crucial for telementoring systems, very deep architectures are not suitable for surgical incision extraction. In this study, we propose a shallow convolutional neural network (S-CNN)-based segmentation approach that consists only of an encoder network for surgical region extraction. The segmentation performance of S-CNN is compared with SegNet, a state-of-the-art image segmentation network, and the results demonstrate the effectiveness of the proposed network. The proposed telementoring system is efficient and explicitly accounts for the physiological nature of the human visual system when encoding the video, providing good overall visual quality at the location of surgery.
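The distance-based quality allocation described above can be illustrated with a minimal sketch. The abstract does not say how quality is mapped to distance, so everything here is an assumption made for illustration: the function name `block_qp_map`, the use of a per-block HEVC quantization parameter (QP), the Chebyshev distance measured in blocks, and the parameters `qp_roi`, `qp_step`, and `qp_max`. It should not be read as the authors' bit allocation scheme.

```python
from typing import List


def block_qp_map(mask: List[List[int]],
                 qp_roi: int = 22,
                 qp_step: int = 4,
                 qp_max: int = 51) -> List[List[int]]:
    """Assign a QP to every block of a frame (hypothetical scheme).

    mask[r][c] is 1 if the block belongs to the segmented surgical
    region and 0 otherwise. A lower QP means finer quantization, so
    region blocks get qp_roi and the QP grows by qp_step per block of
    distance, capped at qp_max (51 is the top of the HEVC QP range).
    """
    roi = [(r, c)
           for r, row in enumerate(mask)
           for c, value in enumerate(row) if value]
    if not roi:
        # Assumed fallback: with no detected region, keep the whole
        # frame at region quality rather than degrade it blindly.
        return [[qp_roi for _ in row] for row in mask]

    qp_map = []
    for r, row in enumerate(mask):
        qp_row = []
        for c in range(len(row)):
            # Chebyshev distance (in blocks) to the nearest region block.
            distance = min(max(abs(r - rr), abs(c - cc)) for rr, cc in roi)
            qp_row.append(min(qp_max, qp_roi + qp_step * distance))
        qp_map.append(qp_row)
    return qp_map


if __name__ == "__main__":
    demo_mask = [
        [0, 0, 0, 0],
        [0, 1, 0, 0],
        [0, 0, 0, 0],
    ]
    for qp_row in block_qp_map(demo_mask):
        print(qp_row)
```

The brute-force nearest-block search is quadratic in the number of blocks; it is kept here only for readability, and a real-time encoder would likely need a proper distance transform instead.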
The proposed S-CNN-based segmentation achieved a pixel accuracy of 97% and a mean intersection over union (IoU) of 79%. The HEVC experiments showed that the proposed surgical region-based encoding scheme achieved an average bitrate reduction of 88.8% at high-quality settings compared with default full-frame HEVC encoding. The average gain in encoding performance (signal-to-noise ratio) of the proposed algorithm is 11.5 dB in the surgical region. For a fair comparison, the bitrate saving and visual quality of the proposed optimal bit allocation scheme are compared with those of a mean shift segmentation-based coding scheme. The results show that the proposed scheme maintains high visual quality in the surgical incision region while achieving good bitrate savings. Based on these comparisons and results, the proposed encoding algorithm can be considered an efficient and effective solution for surgical telementoring over low-bandwidth networks.
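The two segmentation metrics reported above can be computed as in the following sketch. The function names, the nested-list label maps, and the two-class setup (0 = background, 1 = surgical region) are assumptions for illustration; the abstract does not describe the evaluation code. The toy example also hints at why pixel accuracy can sit well above mean IoU when one class dominates the frame.

```python
from typing import List

LabelMap = List[List[int]]


def pixel_accuracy(pred: LabelMap, truth: LabelMap) -> float:
    """Fraction of pixels whose predicted label equals the true label."""
    total = 0
    correct = 0
    for pred_row, truth_row in zip(pred, truth):
        for p, t in zip(pred_row, truth_row):
            total += 1
            correct += int(p == t)
    return correct / total


def mean_iou(pred: LabelMap, truth: LabelMap, num_classes: int = 2) -> float:
    """Intersection over union per class, averaged over classes present."""
    ious = []
    for k in range(num_classes):
        intersection = 0
        union = 0
        for pred_row, truth_row in zip(pred, truth):
            for p, t in zip(pred_row, truth_row):
                intersection += int(p == k and t == k)
                union += int(p == k or t == k)
        if union:  # skip classes absent from both maps
            ious.append(intersection / union)
    return sum(ious) / len(ious)


if __name__ == "__main__":
    truth = [[0, 0, 1, 1],
             [0, 0, 1, 1]]
    pred = [[0, 0, 0, 1],
            [0, 0, 1, 1]]
    print("pixel accuracy:", pixel_accuracy(pred, truth))
    print("mean IoU:", mean_iou(pred, truth))
```

In the toy example a single mislabelled pixel leaves pixel accuracy at 7/8, while the per-class IoU values (4/5 for background, 3/4 for the region) average to a lower figure.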