Evaluating ChatGPT-4o as a decision support tool in multidisciplinary sarcoma tumor boards: heterogeneous performance across various specialties.

Author information

Ammo Tekoshin, Guillaume Vincent G J, Hofmann Ulf Krister, Ulmer Norma M, Buenting Nina, Laenger Florian, Beier Justus P, Leypold Tim

Affiliations

Department of Plastic Surgery, Hand and Reconstructive Surgery, University Hospital Rheinisch-Westfälische Technische Hochschule (RWTH) Aachen, Aachen, Germany.

Department of Orthopedics, Trauma and Reconstructive Surgery, Division of Arthroplasty, University Hospital Rheinisch-Westfälische Technische Hochschule (RWTH) Aachen, Aachen, Germany.

Publication information

Front Oncol. 2025 Jan 17;14:1526288. doi: 10.3389/fonc.2024.1526288. eCollection 2024.

Abstract

BACKGROUND AND OBJECTIVES

Since the launch of ChatGPT in late 2022, large language models have attracted substantial interest for deployment in the health care sector. This study evaluates the performance of ChatGPT-4o as a support tool for decision-making in multidisciplinary sarcoma tumor boards.

METHODS

We created five sarcoma patient cases mimicking real-world scenarios and prompted ChatGPT-4o to issue tumor board decisions. These recommendations were independently assessed by a multidisciplinary panel, consisting of an orthopedic surgeon, plastic surgeon, radiation oncologist, radiologist, and pathologist. Assessments were graded on a Likert scale from 1 (completely disagree) to 5 (completely agree) across five categories: understanding, therapy/diagnostic recommendation, aftercare recommendation, summarization, and support tool effectiveness.
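
The scoring scheme described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' analysis code: the rating values are made-up placeholders for a single hypothetical case, and the names `CATEGORIES`, `validate`, `mean_by_specialty`, and `overall_mean` are introduced here for illustration only.

```python
from statistics import mean

# The five assessment categories named in the Methods section.
CATEGORIES = (
    "understanding",
    "therapy/diagnostic recommendation",
    "aftercare recommendation",
    "summarization",
    "support tool effectiveness",
)

# PLACEHOLDER ratings (NOT study data): one Likert score (1-5) per category
# for a single hypothetical case, keyed by the rater's specialty.
ratings = {
    "orthopedic surgery": [5, 4, 4, 5, 4],
    "plastic surgery": [4, 5, 4, 4, 5],
    "radiation oncology": [3, 2, 2, 3, 2],
    "radiology": [4, 3, 4, 4, 3],
    "pathology": [4, 4, 3, 4, 3],
}


def validate(scores):
    """Check that a rater supplied one in-range Likert score per category."""
    if len(scores) != len(CATEGORIES):
        raise ValueError("expected one score per category")
    if any(not 1 <= s <= 5 for s in scores):
        raise ValueError("Likert scores must lie between 1 and 5")
    return scores


def mean_by_specialty(ratings):
    """Average each rater's scores across the five categories."""
    return {spec: mean(validate(scores)) for spec, scores in ratings.items()}


def overall_mean(ratings):
    """Average all scores pooled across raters and categories."""
    pooled = [s for scores in ratings.values() for s in validate(scores)]
    return mean(pooled)


if __name__ == "__main__":
    for specialty, score in mean_by_specialty(ratings).items():
        print(f"{specialty}: {score:.2f}")
    print(f"overall: {overall_mean(ratings):.2f}")
```

Pooling all scores before averaging weights every rating equally; whether the study averaged per rater first or pooled across cases is not stated in the abstract, so that choice is an assumption of this sketch.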

RESULTS

The mean score for ChatGPT-4o performance was 3.76, indicating moderate effectiveness. The surgical specialties yielded the highest ratings, with a mean score of 4.48. The diagnostic specialties (radiology and pathology) scored considerably higher than radiation oncology, where performance was poor.

CONCLUSIONS

This study provides initial insights into the use of prompt-engineered large language models as decision support tools in sarcoma tumor boards. ChatGPT-4o's recommendations performed best for the surgical specialties, whereas it struggled to give valuable advice in the other specialties tested. Clinicians should understand both the advantages and limitations of this technology for effective integration into clinical practice.

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/2d73/11782276/56eca7684b55/fonc-14-1526288-g001.jpg
