Chen Kun, Xu Wengui, Li Xiaofeng
Department of Nuclear Medicine, Zhejiang Cancer Hospital, Hangzhou Institute of Medicine (HIM), Chinese Academy of Sciences, Hangzhou, Zhejiang 310022, China (K.C.).
Department of Molecular Imaging and Nuclear Medicine, Tianjin Medical University Cancer Institute and Hospital, National Clinical Research Center for Cancer, Huanhuxi Road, Hexi District, Tianjin 300060, China (W.X., X.L.); Tianjin's Clinical Research Center for Cancer, Tianjin 300060, China (W.X., X.L.).
Acad Radiol. 2025 Feb;32(2):624-633. doi: 10.1016/j.acra.2024.08.052. Epub 2024 Sep 7.
To compare the performance of the large language model (LLM)-based Gemini and Generative Pre-trained Transformers (GPTs) in mining data from free-text PET/CT reports for breast cancer and in generating structured reports from them according to user-defined tasks.
Breast cancer patients (mean age, 50 years ± 11 [SD]; all female) who underwent consecutive 18F-FDG PET/CT for follow-up between July 2005 and October 2023 were retrospectively included in the study. A total of 20 reports from 10 patients were used to train user-defined text prompts for Gemini and GPTs, with which structured PET/CT reports were generated. The structured reports generated by natural language processing (NLP) were compared with structured reports annotated by nuclear medicine physicians in terms of data extraction accuracy and capacity for decision-making on disease progression. Statistical methods included the chi-square test, the McNemar test, and the paired-samples t-test.
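The abstract names the McNemar test without stating which comparison it was applied to. One natural use is the paired comparison of two models scored on the same reports, where each model's extraction is marked correct or incorrect against the physician annotation. The following is a minimal sketch under that assumption, not the authors' analysis code: the function names `discordant_counts` and `mcnemar_exact` are hypothetical, the exact (binomial) form of the test is assumed, and the example values are toy data, not study data.

```python
from math import comb


def discordant_counts(a_correct, b_correct):
    """Count discordant pairs between two models scored on the same reports.

    a_correct, b_correct: equal-length sequences of booleans, one entry per
    report, True when that model's extraction matched the reference.
    Returns (b, c): b = only model A correct, c = only model B correct.
    """
    if len(a_correct) != len(b_correct):
        raise ValueError("both models must be scored on the same reports")
    b = sum(1 for x, y in zip(a_correct, b_correct) if x and not y)
    c = sum(1 for x, y in zip(a_correct, b_correct) if y and not x)
    return b, c


def mcnemar_exact(b, c):
    """Two-sided exact McNemar p-value from the discordant pair counts.

    Under the null hypothesis, each discordant pair is equally likely to
    favour either model, so min(b, c) follows Binomial(b + c, 0.5).
    """
    n = b + c
    if n == 0:
        return 1.0  # no discordant pairs: no evidence of a difference
    k = min(b, c)
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return min(1.0, 2 * tail)


if __name__ == "__main__":
    # Toy values for illustration only; these are not data from the study.
    model_a = [True, True, False, True, True, True]
    model_b = [True, False, False, False, True, False]
    b, c = discordant_counts(model_a, model_b)
    print(b, c, mcnemar_exact(b, c))
```

Only the discordant pairs enter the test, so reports that both models got right, or both got wrong, do not affect the p-value.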
Structured PET/CT reports for 131 patients were generated with both NLP techniques, Gemini and GPTs. Overall, GPTs outperformed Gemini in extracting primary lesion size (89.6% vs. 53.8%, p < 0.001) and metastatic lesions (96.3% vs. 89.6%, p < 0.001). GPTs also outperformed Gemini in decision-making on disease progression (p < 0.001) and in the semantic similarity of reports (F1 score, 0.930 vs. 0.907; p < 0.001).
GPTs outperformed Gemini in generating structured reports from free-text PET/CT reports, an approach that may be applicable in clinical practice.
The data used and/or analyzed during the current study are available from the corresponding author on reasonable request.