Synthetic Scientific Image Generation with VAE, GAN, and Diffusion Model Architectures.

Author information

Sordo Zineb, Chagnon Eric, Hu Zixi, Donatelli Jeffrey J, Andeer Peter, Nico Peter S, Northen Trent, Ushizima Daniela

Affiliations

Applied Math and Computational Research Division, Lawrence Berkeley National Laboratory, Berkeley, CA 94720, USA.

Environmental Genomics and Systems Biology Division, Lawrence Berkeley National Laboratory, Berkeley, CA 94720, USA.

Publication information

J Imaging. 2025 Jul 26;11(8):252. doi: 10.3390/jimaging11080252.

Abstract

Generative AI (genAI) has emerged as a powerful tool for synthesizing diverse and complex image data, offering new possibilities for scientific imaging applications. This review presents a comprehensive comparative analysis of leading generative architectures, from Variational Autoencoders (VAEs) and Generative Adversarial Networks (GANs) to Diffusion Models, in the context of scientific image synthesis. We examine each model's foundational principles, recent architectural advancements, and practical trade-offs. Our evaluation, conducted on domain-specific datasets including microCT scans of rocks and composite fibers, as well as high-resolution images of plant roots, integrates both quantitative metrics (SSIM, LPIPS, FID, CLIPScore) and expert-driven qualitative assessments. Results show that GANs, particularly StyleGAN, produce images with high perceptual quality and structural coherence. Diffusion-based models for inpainting and image variation, such as DALL-E 2, delivered high realism and semantic alignment but generally struggled to balance visual fidelity with scientific accuracy. Importantly, our findings reveal limitations of standard quantitative metrics in capturing scientific relevance, underscoring the need for domain-expert validation. We conclude by discussing key challenges such as model interpretability, computational cost, and verification protocols, and outline future directions where generative AI can drive innovation in data augmentation, simulation, and hypothesis generation in scientific research.

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/9bb0/12387873/54a3904141ac/jimaging-11-00252-g008.jpg
