
Improving Readability and Automating Content Analysis of Plastic Surgery Webpages With ChatGPT.

Author Information

Division of Plastic and Reconstructive Surgery, Department of Surgery, Beth Israel Deaconess Medical Center, Harvard Medical School, Boston, Massachusetts.

Department of Plastic and Reconstructive Surgery, Ohio State University Wexner Medical Center, Columbus, Ohio.

Publication Information

J Surg Res. 2024 Jul;299:103-111. doi: 10.1016/j.jss.2024.04.006. Epub 2024 May 14.

Abstract

INTRODUCTION

The quality and readability of online health information are sometimes suboptimal, reducing its usefulness to patients. Manual evaluation of online medical information is time-consuming and error-prone. This study automates content analysis and readability improvement of private-practice plastic surgery webpages using ChatGPT.

METHODS

The first 70 Google search results for "breast implant size factors" and "breast implant size decision" were screened. ChatGPT 3.5 and 4.0 were used with two prompts (1: general, 2: specific) to automate content analysis and to rewrite webpages with improved readability. ChatGPT content analysis outputs were classified as hallucinations (false positives), accurate (true positives or true negatives), or omissions (false negatives), using human-rated scores as the benchmark. Six readability metric scores of the original and revised webpage texts were compared.
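
To make the scoring scheme concrete, here is a minimal Python sketch of the classification described above, assuming each decision-making factor is rated as present (1) or absent (0) by both a human rater and ChatGPT. The function names and example data are hypothetical illustrations, not taken from the study's actual code.

```python
# Sketch of the study's three-way scoring of ChatGPT content-analysis output
# against human-rated benchmarks. Assumes binary per-factor ratings.

def classify_output(human: int, model: int) -> str:
    """Map one (human, model) rating pair to the study's categories."""
    if model == 1 and human == 0:
        return "hallucination"  # false positive
    if model == 0 and human == 1:
        return "omission"       # false negative
    return "accurate"           # true positive or true negative

def category_rates(pairs):
    """Rates of hallucination/accuracy/omission over (human, model)
    rating pairs for one decision-making factor."""
    counts = {"hallucination": 0, "accurate": 0, "omission": 0}
    for human, model in pairs:
        counts[classify_output(human, model)] += 1
    total = len(pairs)
    return {k: v / total for k, v in counts.items()}

# Example: one factor rated across five webpages.
print(category_rates([(1, 1), (0, 1), (1, 0), (0, 0), (1, 1)]))
# {'hallucination': 0.2, 'accurate': 0.6, 'omission': 0.2}
```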

RESULTS

Seventy-five webpages were included. Significant improvements over baseline were achieved in all six readability metric scores using the specific-instruction prompt with ChatGPT 3.5 (all P ≤ 0.05). No further improvements in readability scores were achieved with ChatGPT 4.0. Rates of hallucination, accuracy, and omission in ChatGPT content scoring varied widely across decision-making factors. Compared with ChatGPT 3.5, average accuracy rates increased and omission rates decreased with ChatGPT 4.0 content analysis output.
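
The abstract does not name the six readability metrics or the statistical test used for the original-versus-revised comparison. As a hedged illustration only, the sketch below assumes one common metric (Flesch-Kincaid grade level, via the textstat package) and a paired Wilcoxon signed-rank test from scipy; the study may have used different metrics and tests.

```python
# Illustrative original-vs-revised readability comparison.
# Assumptions: Flesch-Kincaid grade level as the metric (textstat) and a
# paired nonparametric test (scipy's Wilcoxon signed-rank test).
import textstat
from scipy.stats import wilcoxon

def compare_readability(original_texts, revised_texts):
    """Score paired webpage texts and test whether revision lowered
    (i.e., improved) the Flesch-Kincaid grade level."""
    orig = [textstat.flesch_kincaid_grade(t) for t in original_texts]
    rev = [textstat.flesch_kincaid_grade(t) for t in revised_texts]
    stat, p = wilcoxon(orig, rev)  # paired samples, one per webpage
    return {"original_mean": sum(orig) / len(orig),
            "revised_mean": sum(rev) / len(rev),
            "p_value": p}
```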

CONCLUSIONS

ChatGPT offers an innovative approach to enhancing the quality of online medical information and expanding the capabilities of plastic surgery research and practice. Automation of content analysis is limited by ChatGPT 3.5's high omission rates and ChatGPT 4.0's high hallucination rates. Our results also underscore the importance of iterative prompt design to optimize ChatGPT performance in research tasks.
