IQAGPT: computed tomography image quality assessment with vision-language and ChatGPT models.

Authors

Chen Zhihao, Hu Bin, Niu Chuang, Chen Tao, Li Yuxin, Shan Hongming, Wang Ge

Affiliations

Institute of Science and Technology for Brain-Inspired Intelligence, Fudan University, Shanghai, 200433, China.

Department of Radiology, Huashan Hospital, Fudan University, Shanghai, 200040, China.

Publication information

Vis Comput Ind Biomed Art. 2024 Aug 5;7(1):20. doi: 10.1186/s42492-024-00171-w.

DOI:10.1186/s42492-024-00171-w
PMID:39101954
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC11300764/
Abstract

Large language models (LLMs), such as ChatGPT, have demonstrated impressive capabilities in various tasks and attracted increasing interest as a natural language interface across many domains. Recently, large vision-language models (VLMs) that learn rich vision-language correlation from image-text pairs, like BLIP-2 and GPT-4, have been intensively investigated. However, despite these developments, the application of LLMs and VLMs in image quality assessment (IQA), particularly in medical imaging, remains unexplored. This is valuable for objective performance evaluation and potential supplement or even replacement of radiologists' opinions. To this end, this study introduces IQAGPT, an innovative computed tomography (CT) IQA system that integrates image-quality captioning VLM with ChatGPT to generate quality scores and textual reports. First, a CT-IQA dataset comprising 1,000 CT slices with diverse quality levels is professionally annotated and compiled for training and evaluation. To better leverage the capabilities of LLMs, the annotated quality scores are converted into semantically rich text descriptions using a prompt template. Second, the image-quality captioning VLM is fine-tuned on the CT-IQA dataset to generate quality descriptions. The captioning model fuses image and text features through cross-modal attention. Third, based on the quality descriptions, users verbally request ChatGPT to rate image-quality scores or produce radiological quality reports. Results demonstrate the feasibility of assessing image quality using LLMs. The proposed IQAGPT outperformed GPT-4 and CLIP-IQA, as well as multitask classification and regression models that solely rely on images.
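
The score-to-text step mentioned in the abstract (annotated quality scores converted into text descriptions through a prompt template) might look roughly like the sketch below. The abstract does not give the actual score scale, vocabulary, or template wording, so the 0 to 4 scale, the `QUALITY_WORDS` mapping, the `TEMPLATE` string, and the function name `score_to_description` are all hypothetical and serve only to illustrate the idea.

```python
from typing import Dict

# Hypothetical 5-level scale; the real CT-IQA annotation scheme is not
# specified in the abstract.
QUALITY_WORDS: Dict[int, str] = {
    0: "very poor",
    1: "poor",
    2: "fair",
    3: "good",
    4: "excellent",
}

# Hypothetical prompt template; the wording is an assumption.
TEMPLATE = "The overall quality of this CT image is {level} (score {score} out of {max_score})."


def score_to_description(score: int) -> str:
    """Turn a numeric quality score into a short text description."""
    if score not in QUALITY_WORDS:
        raise ValueError(f"score must be one of {sorted(QUALITY_WORDS)}, got {score}")
    return TEMPLATE.format(
        level=QUALITY_WORDS[score],
        score=score,
        max_score=max(QUALITY_WORDS),
    )


if __name__ == "__main__":
    # Print the description generated for every level of the assumed scale.
    for s in sorted(QUALITY_WORDS):
        print(score_to_description(s))
```

Descriptions produced this way could serve as caption targets when fine-tuning a captioning model, which is the role the abstract assigns to the converted annotations.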


Figures: Fig. 1 to Fig. 13 (images hosted on cdn.ncbi.nlm.nih.gov; available in the PMC full text).
