BACKGROUND

Artificial intelligence (AI) chatbots, including ChatGPT-4 (GPT-4) and Grok-1 (Grok), have shown potential usefulness in several medical fields but have not been examined in plastic and aesthetic surgery. This study aimed to evaluate the responses of these AI chatbots to clinical questions (CQs) from the guidelines for implant-based breast reconstruction (IBBR) published by the Japan Society of Plastic and Reconstructive Surgery (JSPRS) in 2021.

METHODS

The CQs in the JSPRS guidelines were used as the question source. Responses from two AI chatbots, GPT-4 and Grok, were evaluated for accuracy, informativeness, and readability by five Japanese board-certified breast reconstruction specialists and five Japanese clinical fellows in plastic surgery.

RESULTS

When evaluated by plastic surgery fellows, GPT-4 significantly outperformed Grok in accuracy (p < 0.001), informativeness (p < 0.001), and readability (p < 0.001). Compared with the original guidelines, Grok scored significantly lower in all three areas (all p < 0.001). Plastic surgery fellows rated the accuracy of GPT-4 significantly higher than breast reconstruction specialists did (p = 0.012), whereas the two groups' scores for Grok did not differ significantly.

CONCLUSIONS

The study suggests that GPT-4 has the potential to assist in interpreting and applying clinical guidelines for IBBR; importantly, however, AI chatbots still carry a risk of providing misinformation. Further studies are needed to understand the broader role of current and future AI chatbots in breast reconstruction surgery.

LEVEL OF EVIDENCE IV

This journal requires that authors assign a level of evidence to each article. For a full description of these Evidence-Based Medicine Ratings, please refer to the Table of Contents or the online Instructions to Authors at www.springer.com/00266.

Performance of Artificial Intelligence Chatbots in Answering Clinical Questions on Japanese Practical Guidelines for Implant-based Breast Reconstruction.
