Dall-E in hand surgery: Exploring the utility of ChatGPT image generation.

BACKGROUND: Artificial intelligence (AI) has significantly influenced various medical fields, including plastic surgery. Large language model (LLM) chatbots such as ChatGPT and text-to-image tools such as DALL-E and GPT-4o are gaining broader adoption. This study explores the capabilities and limitations of these tools in hand surgery, focusing on their application in patient and medical education.

METHODS: Common search terms were identified from Google Trends data in the following categories: "Hand Anatomy", "Hand Fracture", "Hand Joint Injury", "Hand Tumor", and "Hand Dislocation". These terms were submitted as queries to ChatGPT-4.5 and ChatGPT-3.5. Responses were graded on a 1-5 scale for accuracy and evaluated using the Flesch-Kincaid Grade Level, the Patient Education Materials Assessment Tool (PEMAT), and the DISCERN instrument. GPT-4o, DALL-E 3, and DALL-E 2 were used to generate visual representations of selected ChatGPT responses in each category, and these images were evaluated in turn.

RESULTS: ChatGPT-4.5 achieved a DISCERN overall score of 3.80 ± 0.23. Its responses averaged 91.67 ± 0.29 for PEMAT understandability and 54.67 ± 0.55 for PEMAT actionability. Accuracy was 4.47 ± 0.52, with a Flesch-Kincaid Grade Level of 9.26 ± 1.04. ChatGPT-4.5 consistently outperformed ChatGPT-3.5 across all evaluation metrics. For text-to-image generation, GPT-4o produced more accurate visuals than DALL-E 3 and DALL-E 2.

CONCLUSIONS: This study highlights the strengths and limitations of ChatGPT-4.5 and GPT-4o in hand surgery education. Combining accurate text generation with image creation shows promise, but these AI tools need further refinement before widespread clinical adoption.

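The Flesch-Kincaid Grade Level used in the methods is a fixed formula over word, sentence, and syllable counts: 0.39 × (words / sentences) + 11.8 × (syllables / words) − 15.59. The sketch below is illustrative only and is not the tool the authors used. The function names are hypothetical, and the syllable counter is a rough vowel-group heuristic (an assumption), so its scores can differ slightly from those of established readability software.

```python
import re


def count_syllables(word: str) -> int:
    """Approximate syllables as vowel groups (heuristic, an assumption)."""
    word = word.lower()
    count = len(re.findall(r"[aeiouy]+", word))
    # Discount a trailing silent 'e' (e.g. "fracture"), but not "-le" or "-ee".
    if word.endswith("e") and not word.endswith(("le", "ee")) and count > 1:
        count -= 1
    return max(count, 1)


def flesch_kincaid_grade(text: str) -> float:
    """Flesch-Kincaid Grade Level for a block of English text."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z]+(?:'[A-Za-z]+)?", text)
    if not sentences or not words:
        raise ValueError("text must contain at least one sentence and one word")
    syllables = sum(count_syllables(w) for w in words)
    return (
        0.39 * len(words) / len(sentences)
        + 11.8 * syllables / len(words)
        - 15.59
    )
```

Longer sentences and more polysyllabic words raise the grade level, which is why a score near 9, as reported for ChatGPT-4.5, corresponds roughly to a ninth-grade reading level.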

