Cerqueira Bruno Pellozo, Leite Vinicius Cappellette da Silva, França Carla Gonzaga, Leitão Filho Fernando Sergio, Faresin Sonia Maria, Figueiredo Ricardo Gassmann, Cetlin Andrea Antunes, Caetano Lilian Serrasqueiro Ballini, Baddini-Martinez José
Escola Paulista de Medicina, Universidade Federal de São Paulo, São Paulo (SP) Brasil.
Divisão de Pneumologia, Escola Paulista de Medicina, Universidade Federal de São Paulo, São Paulo (SP) Brasil.
J Bras Pneumol. 2025 Sep 8;51(3):e20240388. doi: 10.36416/1806-3756/e20240388. eCollection 2025.
OBJECTIVE: To evaluate the quality of ChatGPT's answers to asthma-related questions from the perspectives of asthma specialists and laypersons.
METHODS: Seven asthma-related questions were posed to ChatGPT (version 4) between May 3 and May 4, 2024. The questions were standardized, and each was submitted in a session with no memory of previous conversations, to avoid bias. Six pulmonologists with extensive expertise in asthma acted as judges, independently assessing the quality and reproducibility of the answers from the perspectives of both asthma specialists and laypersons. Answers were rated on a 4-point Likert scale (1 to 4), and the content validity coefficient (CVC) was calculated to assess the level of agreement among the judges.
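The abstract does not state which CVC formula the authors used; as an illustration only, the sketch below assumes the Hernández-Nieto formulation often adopted in validation studies of this kind, with hypothetical ratings from the six judges on the 1-to-4 scale.

```python
# Illustrative sketch only: content validity coefficient (CVC) computed with
# the Hernandez-Nieto formulation -- an assumption, since the abstract does
# not specify the formula. All ratings below are hypothetical.

V_MAX = 4  # top of the 1-4 Likert scale used by the judges

def cvc(scores):
    """Corrected CVC for one question from the judges' Likert scores."""
    n = len(scores)
    cvc_i = (sum(scores) / n) / V_MAX  # mean rating relative to the maximum
    pe_i = (1 / n) ** n                # correction for possible judge bias
    return cvc_i - pe_i

# Hypothetical ratings from the six judges for one of the seven questions
print(f"CVC = {cvc([3, 4, 3, 4, 3, 4]):.3f}")  # prints CVC = 0.875
```

On this formulation, the 0.80 acceptability threshold reported in the results corresponds to a mean rating of roughly 3.2 on the 4-point scale.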
RESULTS: The evaluations showed variability in the quality of the answers provided by ChatGPT. From the perspective of asthma specialists, scores ranged from 2 to 3, with greater divergence among the judges for questions 2, 3, and 5. From the perspective of laypersons, the content validity coefficient exceeded 0.80 for four of the seven questions; most answers were correct, albeit lacking depth.
CONCLUSIONS: ChatGPT performed well in providing answers suited to laypersons, but the answers it provided to specialists were less accurate and more superficial. Although AI has the potential to provide useful information to the public, it should not replace medical guidance. Critical analysis of AI-generated information remains essential for health care professionals and laypersons alike, especially for complex conditions such as asthma.