Department of Mechanical Engineering, Vanderbilt University, 2301 Vanderbilt Place, PMB 351826, Nashville, TN 37235-1826, USA.
Artif Intell Med. 2013 Nov;59(3):185-96. doi: 10.1016/j.artmed.2013.09.002. Epub 2013 Oct 10.
Colorectal cancer is one of the leading causes of cancer-related deaths in the world, although it can be effectively treated if detected early. Teleoperated flexible endoscopes are an emerging technology intended to ease patient apprehension about the procedure and thereby increase compliance. Essential to teleoperation is robust feedback reflecting the change in pose (i.e., position and orientation) of the tip of the endoscope. The first goal of this study is to describe a novel image-based tracking system for teleoperated flexible endoscopes and to determine its viability in a clinical setting. The proposed approach leverages artificial neural networks (ANNs) to learn the mapping that links the optical flow between two sequential images to the change in the pose of the camera. The second goal is to investigate, for the first time, how narrow band illumination (NBI), available today in commercial gastrointestinal endoscopes, can be applied to enhance feature extraction, and to quantify the effect of NBI and white light illumination (WLI), as well as their color information, on the strength of features extracted from the endoscopic camera stream.
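As a rough illustration of the mapping described above, the sketch below passes an optical-flow descriptor through a small feed-forward network to produce a six-DOF pose change. The abstract does not specify the network architecture, so the single hidden layer, the `tanh` activation, the layer sizes, and the names `init_mlp` and `predict_pose_change` are all assumptions, and the weights shown are untrained.

```python
import numpy as np

def init_mlp(n_inputs, n_hidden, n_outputs=6, seed=0):
    """Create small random weights for a one-hidden-layer network (hypothetical sizes)."""
    rng = np.random.default_rng(seed)
    return {
        "W1": rng.normal(0.0, 0.1, (n_inputs, n_hidden)),
        "b1": np.zeros(n_hidden),
        "W2": rng.normal(0.0, 0.1, (n_hidden, n_outputs)),
        "b2": np.zeros(n_outputs),
    }

def predict_pose_change(params, descriptor):
    """Map a flow descriptor to (dx, dy, dz, droll, dpitch, dyaw).

    Assumption: three positional and three rotational outputs; the
    ordering is illustrative only.
    """
    hidden = np.tanh(descriptor @ params["W1"] + params["b1"])
    return hidden @ params["W2"] + params["b2"]

# Example: an 18-element descriptor (e.g. 9 regions x 2 flow components).
params = init_mlp(n_inputs=18, n_hidden=10)
pose_change = predict_pose_change(params, np.zeros(18))
```

In practice the weights would be fitted by regression against ground-truth pose changes; that training step is omitted here.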
To provide the best features for the neural networks to learn the change in pose from the image stream, we investigated two imaging modalities (WLI and NBI) and applied two spatial partitions (lumen-centered and grid-based) to create the descriptors used as input to the ANNs. An experiment was performed to compare the error of the four resulting modality-partition variations, measured as root mean square error (RMSE) from ground truth given by a robotic arm, to that of a commercial state-of-the-art magnetic tracker. The viability of this technique for a clinical setting was then tested using the four ANN variations, a magnetic tracker, and a commercial colonoscope. The trial was performed by an expert endoscopist (>2000 lifetime procedures) on a colonoscopy training model with porcine blood, and the RMSE of the ANN output was calculated with respect to the magnetic tracker readings. Using the image stream obtained from the commercial endoscope, the strength of the extracted features was evaluated.
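The two spatial partitions can be sketched as follows. The abstract does not describe how the descriptors are built, so this is only one plausible reading: each tracked feature contributes its flow vector to a region, and the descriptor is the mean flow per region. The angular-sector layout for the lumen-centered case, the region counts, and the function names are assumptions.

```python
import numpy as np

def _mean_flow_per_bin(bin_index, flows, n_bins):
    """Average the 2D flow vectors that fall in each bin; empty bins stay zero."""
    sums = np.zeros((n_bins, 2))
    counts = np.zeros(n_bins)
    np.add.at(sums, bin_index, flows)
    np.add.at(counts, bin_index, 1)
    return (sums / np.maximum(counts, 1)[:, None]).ravel()

def grid_descriptor(points, flows, image_size, grid=(3, 3)):
    """Grid-based partition: mean flow per cell of a regular grid."""
    width, height = image_size
    cols, rows = grid
    # Clip so that points on the right or bottom edge stay in the last cell.
    cx = np.clip((points[:, 0] / width * cols).astype(int), 0, cols - 1)
    cy = np.clip((points[:, 1] / height * rows).astype(int), 0, rows - 1)
    return _mean_flow_per_bin(cy * cols + cx, flows, cols * rows)

def lumen_descriptor(points, flows, lumen_center, n_sectors=8):
    """Lumen-centered partition (assumed): mean flow per angular sector
    around the lumen center, which is taken as given here."""
    offset = points - np.asarray(lumen_center, dtype=float)
    angle = np.arctan2(offset[:, 1], offset[:, 0])  # range [-pi, pi]
    sector = ((angle + np.pi) / (2 * np.pi) * n_sectors).astype(int)
    sector = np.clip(sector, 0, n_sectors - 1)  # angle == pi goes to the last sector
    return _mean_flow_per_bin(sector, flows, n_sectors)

# Two tracked features with their frame-to-frame displacement in pixels.
points = np.array([[10.0, 10.0], [90.0, 90.0]])
flows = np.array([[1.0, 0.0], [0.0, 2.0]])
grid_features = grid_descriptor(points, flows, image_size=(100, 100), grid=(2, 2))
lumen_features = lumen_descriptor(points, flows, lumen_center=(50, 50), n_sectors=4)
```

Either descriptor is a fixed-length vector regardless of how many features are tracked, which is what makes it usable as ANN input.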
In the first experiment, the best ANNs resulted from grid-based partitioning under WLI (2.42 mm RMSE) for position, and from lumen-centered partitioning under NBI (1.69° RMSE) for rotation. By comparison, the performance of the tracker was 2.49 mm RMSE in position and 0.89° RMSE in rotation. The trial with the commercial endoscope indicated that lumen-centered partitioning was the best partition overall, while NBI outperformed WLI as the illumination modality. The performance of lumen-centered partitioning with NBI was 1.03±0.8 mm RMSE in positional degrees of freedom (DOF) and 1.26±0.98° RMSE in rotational DOF, while with WLI the performance was 1.56±1.15 mm RMSE in positional DOF and 2.45±1.90° RMSE in rotational DOF. Finally, the features extracted under NBI were found to be twice as strong as those extracted under WLI, but no significant difference in feature strength was observed between a grayscale version of the image and the red, blue, and green color channels.
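For reference, the RMSE metric used in these comparisons can be computed per degree of freedom as in the minimal sketch below. The function name and the array layout (one row per sample, one column per DOF) are assumptions, and the numbers in the example are made up for illustration and are not data from the study.

```python
import numpy as np

def rmse_per_dof(estimated, reference):
    """Root mean square error of each DOF (column) over all samples (rows)."""
    estimated = np.asarray(estimated, dtype=float)
    reference = np.asarray(reference, dtype=float)
    return np.sqrt(np.mean((estimated - reference) ** 2, axis=0))

# Illustrative values only: two samples, two DOF.
estimated = [[1.0, 2.0], [3.0, 4.0]]
reference = [[0.0, 2.0], [3.0, 2.0]]
errors = rmse_per_dof(estimated, reference)
```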
This work demonstrates that both WLI and NBI, combined with feature partitioning based on the anatomy of the colon, provide valid mechanisms for endoscopic camera pose estimation from the image stream. Illumination by WLI and by NBI produces ANNs with similar performance, comparable to that of a state-of-the-art magnetic tracker. However, NBI produces stronger features than WLI, which enables more robust feature tracking and better ANN accuracy. Thus, NBI with lumen-centered partitioning was the best approach among the variations tested for vision-based pose estimation. The proposed approach takes advantage of components already available in commercial gastrointestinal endoscopes to provide accurate feedback about the motion of the tip of the endoscope. This solution may serve as an enabling technology for closed-loop control of teleoperated flexible endoscopes.