Zixin Yang, Richard Simon, Cristian Linte
Center for Imaging Science, Rochester Institute of Technology, Rochester, NY 14623, USA.
Department of Biomedical Engineering, Rochester Institute of Technology, Rochester, NY 14623, USA.
Proc SPIE Int Soc Opt Eng. 2023 Feb;12466. doi: 10.1117/12.2654804. Epub 2023 Apr 3.
Stereo matching methods that enable depth estimation are crucial for visualization-enhancement applications in computer-assisted surgery (CAS). Learning-based stereo matching methods show promise for predicting accurate results in applications involving video images. However, they require large amounts of training data, and their performance may degrade under domain shift. Maintaining the robustness and improving the performance of learning-based methods remain open problems. To overcome these limitations, we propose a disparity refinement framework, consisting of a local disparity refinement method and a global disparity refinement method, to improve the results of learning-based stereo matching methods in a cross-domain setting. The learning-based stereo matching methods are pre-trained on a large public dataset of natural images and tested on a dataset of laparoscopic images. Results on the SERV-CT dataset show that the proposed framework can effectively refine disparity maps on an unseen dataset, even when they are corrupted by noise, without compromising correct predictions, provided the network generalizes reasonably well to unseen data. As such, the proposed disparity refinement framework has the potential to work with learning-based methods to achieve robust and accurate disparity prediction. Since a large laparoscopic dataset for training learning-based methods does not yet exist and the generalization ability of networks remains limited, incorporating the proposed disparity refinement framework into existing networks would be beneficial for more accurate and robust depth estimation.
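The two-stage local/global refinement idea can be illustrated with a minimal sketch. Note that this is not the paper's actual method, whose details are not given in the abstract: it assumes a dense disparity map stored as a NumPy array, uses a 3x3 median filter as a stand-in for the local refinement stage (suppressing isolated noisy predictions), and a few iterations of regularized neighbor averaging as a stand-in for the global stage (enforcing spatial smoothness while staying close to the input).

```python
import numpy as np

def local_refine(disp, thresh=3.0):
    """Local stage (illustrative): replace pixels that deviate strongly
    from their 3x3 neighborhood median (likely noise) with that median."""
    pad = np.pad(disp, 1, mode="edge")
    # Stack the 9 shifted views of the padded map: each pixel's 3x3 window.
    windows = np.stack([pad[i:i + disp.shape[0], j:j + disp.shape[1]]
                        for i in range(3) for j in range(3)], axis=-1)
    med = np.median(windows, axis=-1)
    out = disp.astype(float).copy()
    outliers = np.abs(out - med) > thresh
    out[outliers] = med[outliers]
    return out

def global_refine(disp, data_weight=0.5, iters=20):
    """Global stage (illustrative): Jacobi-style iterations that pull each
    pixel toward its 4-neighbor average while a data term keeps it close
    to the locally refined input d0."""
    d0 = disp.astype(float)
    d = d0.copy()
    for _ in range(iters):
        pad = np.pad(d, 1, mode="edge")
        neighbor_avg = (pad[:-2, 1:-1] + pad[2:, 1:-1] +
                        pad[1:-1, :-2] + pad[1:-1, 2:]) / 4.0
        d = (data_weight * d0 + neighbor_avg) / (data_weight + 1.0)
    return d

def refine(disp):
    """Full pipeline sketch: local outlier removal, then global smoothing."""
    return global_refine(local_refine(disp))
```

For example, a smooth disparity ramp corrupted by a single large spike is restored to within a small tolerance of the clean ramp, since the spike is caught by the local median test and the global stage leaves already-smooth regions essentially unchanged.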