SurGrID: controllable surgical simulation via Scene Graph to Image Diffusion.

Authors

Frisch Yannik, Sivakumar Ssharvien Kumar, Köksal Çağhan, Böhm Elsa, Wagner Felix, Gericke Adrian, Ghazaei Ghazal, Mukhopadhyay Anirban

Affiliations

TU Darmstadt, Fraunhoferstr. 5, 64297, Darmstadt, Germany.

Universitätsmedizin Mainz, Langenbeckstr. 1, 55131, Mainz, Germany.

Publication information

Int J Comput Assist Radiol Surg. 2025 May 21. doi: 10.1007/s11548-025-03397-y.

Abstract

PURPOSE

Surgical simulation offers a promising addition to conventional surgical training. However, available simulation tools lack photorealism and rely on hard-coded behaviour. Denoising Diffusion Models are a promising alternative for high-fidelity image synthesis, but existing state-of-the-art conditioning methods fall short in providing precise control or interactivity over the generated scenes.

METHODS

We introduce SurGrID, a Scene Graph to Image Diffusion Model that allows controllable surgical scene synthesis by leveraging Scene Graphs. These graphs encode the spatial and semantic information of a surgical scene's components. The graphs are then translated into an intermediate representation using our novel pre-training step, which explicitly captures local and global information.
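
To make the idea of a scene graph carrying spatial and semantic information more concrete, the following is a minimal illustrative sketch in Python. It is not the authors' implementation: the class names (`Node`, `SceneGraph`), the node labels ("pupil", "forceps"), the relation name ("left_of") and the normalised coordinates are all hypothetical, chosen only to show how such a graph could be represented and edited before being passed to a conditioning encoder.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass(frozen=True)
class Node:
    """One scene component: semantic class plus spatial position (all hypothetical)."""
    label: str  # semantic information, e.g. an anatomy or tool class
    x: float    # spatial information: normalised centroid in [0, 1]
    y: float


@dataclass
class SceneGraph:
    """Illustrative container; edges are (source index, relation, target index)."""
    nodes: list[Node] = field(default_factory=list)
    edges: list[tuple[int, str, int]] = field(default_factory=list)

    def add_node(self, label: str, x: float, y: float) -> int:
        self.nodes.append(Node(label, x, y))
        return len(self.nodes) - 1

    def connect(self, src: int, relation: str, dst: int) -> None:
        self.edges.append((src, relation, dst))

    def move(self, idx: int, x: float, y: float) -> None:
        # Editing a node position is the kind of interaction a user might perform
        # before asking a generative model for a new image.
        old = self.nodes[idx]
        self.nodes[idx] = Node(old.label, x, y)

    def triples(self) -> list[tuple[str, str, str]]:
        return [(self.nodes[s].label, r, self.nodes[d].label) for s, r, d in self.edges]


def build_example() -> SceneGraph:
    g = SceneGraph()
    pupil = g.add_node("pupil", 0.5, 0.5)      # hypothetical anatomy node
    tool = g.add_node("forceps", 0.2, 0.7)     # hypothetical instrument node
    g.connect(tool, "left_of", pupil)          # hypothetical spatial relation
    return g
```

In this sketch, calling `build_example()` and then `move(1, 0.4, 0.6)` changes only the instrument's position while its class and relations stay intact, which is the sort of local edit that graph-based conditioning is meant to expose.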

RESULTS

Our proposed method improves the fidelity of generated images and their coherence with the graph input over the state of the art. Further, we demonstrate the simulation's realism and controllability in a user assessment study involving clinical experts.

CONCLUSION

Scene Graphs can be used effectively for precise and interactive conditioning of Denoising Diffusion Models when simulating surgical scenes, enabling high-fidelity control over the generated content.
