Multimodal generative AI for interpreting 3D medical images and videos.

Author information

Lee Jung-Oh, Zhou Hong-Yu, Berzin Tyler M, Sodickson Daniel K, Rajpurkar Pranav

Affiliations

Department of Radiology, Seoul National University Hospital, Seoul, Republic of Korea.

Department of Biomedical Informatics, Harvard Medical School, Boston, USA.

Publication information

NPJ Digit Med. 2025 May 13;8(1):273. doi: 10.1038/s41746-025-01649-4.

Abstract

This perspective proposes adapting video-text generative AI to 3D medical imaging (CT/MRI) and medical videos (endoscopy/laparoscopy) by treating 3D images as videos. The approach leverages modern video models to analyze multiple sequences simultaneously and provide real-time AI assistance during procedures. The paper examines medical imaging's unique characteristics (synergistic information, metadata, and world model), outlines applications in automated reporting, case retrieval, and education, and addresses challenges of limited datasets, benchmarks, and specialized training.
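As a rough illustration of the "3D images as videos" idea, the sketch below turns a volume into an ordered list of 8-bit frames, one per axial slice, which is the kind of input a video model typically expects. This is not the authors' pipeline: the abstract does not describe any preprocessing, so the function names (`window_to_uint8`, `volume_to_frames`), the intensity window, and the toy voxel values are all assumptions made for illustration.

```python
from typing import List

Volume = List[List[List[float]]]  # indexed as volume[z][y][x]
Frame = List[List[int]]           # 8-bit grayscale frame, frame[y][x]


def window_to_uint8(value: float, low: float, high: float) -> int:
    """Clip an intensity to [low, high] and rescale it to 0..255."""
    clipped = min(max(value, low), high)
    return round(255 * (clipped - low) / (high - low))


def volume_to_frames(volume: Volume, low: float, high: float) -> List[Frame]:
    """Treat each axial slice as one video frame, preserving slice order."""
    return [
        [[window_to_uint8(v, low, high) for v in row] for row in axial_slice]
        for axial_slice in volume
    ]


# Toy 3-slice volume of 2x2 voxels; the values are made up (HU-like units).
toy_volume = [
    [[-1000.0, 0.0], [40.0, 400.0]],
    [[-1000.0, 20.0], [60.0, 400.0]],
    [[-1000.0, 40.0], [80.0, 1000.0]],
]

# Hypothetical intensity window; a real choice depends on modality and task.
frames = volume_to_frames(toy_volume, low=-160.0, high=240.0)
print(len(frames), "frames of size", len(frames[0]), "x", len(frames[0][0]))
```

In this framing the slice index plays the role of time, so several MRI sequences of the same patient could in principle be passed as several such clips, though how a given model consumes them is outside what the abstract specifies.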

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/f147/12075794/9f0b139d0c17/41746_2025_1649_Fig1_HTML.jpg
