

Evaluating ChatGPT's Utility in Biologic Therapy for Systemic Lupus Erythematosus: Comparative Study of ChatGPT and Google Web Search

Author Information

Li Kai, Peng Yunfei, Li Luyi, Liu Bo, Huang Zhijian

Affiliations

School of Journalism and Communication, Guangxi University, No.100, Daxue East Road, Nanning, 530004, China, 86 13367611322.

Department of Rheumatology and Immunology, The Second Affiliated Hospital of Guangxi Medical University, Nanning, China.

Publication Information

JMIR Form Res. 2025 Aug 28;9:e76458. doi: 10.2196/76458.

DOI: 10.2196/76458
PMID: 40878064
Abstract

BACKGROUND

Systemic lupus erythematosus (SLE) is a life-threatening, multisystem autoimmune disease. Biologic therapy is a promising treatment for SLE. However, public understanding of this therapy is still insufficient, and the quality of related information on the internet varies, which affects patients' acceptance of this treatment. The effectiveness of artificial intelligence technologies, such as ChatGPT (OpenAI), in knowledge dissemination within the health care field has attracted significant attention. Research on ChatGPT's utility in answering questions regarding biologic therapy for SLE could promote the dissemination of this treatment.

OBJECTIVE

This study aimed to evaluate ChatGPT's utility as a tool for users to obtain health information about biologic therapy for SLE.

METHODS

This study extracted 20 common questions related to biologic therapy for SLE, their corresponding answers, and the sources of these answers from both Google Web Search and ChatGPT-4o (OpenAI). Then, based on Rothwell's classification, the questions were categorized into 3 main types: fact, policy, and value. The sources of the answers were classified into 5 categories: commercial, academic, medical practice, government, and social media. The accuracy and completeness of the answers were assessed using Likert scales. The readability of the answers was evaluated using the Flesch Reading Ease and Flesch-Kincaid Grade Level (FKGL) scores.
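The two readability metrics used above have standard closed-form definitions: Flesch Reading Ease = 206.835 − 1.015 × (words/sentence) − 84.6 × (syllables/word), and FKGL = 0.39 × (words/sentence) + 11.8 × (syllables/word) − 15.59. As an illustrative sketch (not the authors' tooling), the following Python computes both from raw text using a rough vowel-group syllable heuristic; published readability tools use dictionary-based syllabification and will give somewhat different numbers:

```python
import re

def count_syllables(word: str) -> int:
    # Rough heuristic: count vowel groups, drop a trailing silent "e".
    # Real readability tools use dictionary-based syllabification.
    word = word.lower()
    n = len(re.findall(r"[aeiouy]+", word))
    if word.endswith("e") and n > 1:
        n -= 1
    return max(n, 1)

def flesch_scores(text: str) -> tuple[float, float]:
    # Returns (Flesch Reading Ease, Flesch-Kincaid Grade Level).
    sentences = max(len(re.findall(r"[.!?]+", text)), 1)
    words = re.findall(r"[A-Za-z']+", text)
    syllables = sum(count_syllables(w) for w in words)
    wps = len(words) / sentences   # average words per sentence
    spw = syllables / len(words)   # average syllables per word
    fre = 206.835 - 1.015 * wps - 84.6 * spw
    fkgl = 0.39 * wps + 11.8 * spw - 15.59
    return fre, fkgl
```

Lower Flesch Reading Ease and higher FKGL both indicate harder text; a one-syllable-per-word sentence scores well above 100 on FRE, while dense clinical prose can score near zero, which is the regime the study reports.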

RESULTS

The study found that, in terms of question types, ChatGPT-4o had the highest proportion of fact questions (10/20), followed by policy (7/20) and value (3/20). Google Web Search had the highest proportion of fact questions (12/20), followed by value (5/20) and policy (3/20). In terms of website sources, ChatGPT-4o's answers drew on 48 sources, the majority academic (29/48); Google Web Search provided answers from 20 sources, distributed evenly across all 5 categories. For accuracy, ChatGPT-4o's mean score of 5.83 (SD 0.49) was higher than that of Google Web Search (mean 4.75, SD 0.94), with a mean difference of 1.08 (95% CI 0.61-1.54). For completeness, ChatGPT-4o's mean score of 2.88 (SD 0.32) was higher than that of Google Web Search (mean 1.68, SD 0.69), with a mean difference of 1.2 (95% CI 0.96-1.44). For readability, the Flesch Reading Ease scores for ChatGPT-4o and Google Web Search were 11.7 and 14.9, and the FKGL scores were 16.2 and 20, respectively, indicating that both texts were of high reading difficulty, requiring college graduate-level reading proficiency. When ChatGPT was asked to respond at a sixth-grade level, the readability of its answers improved significantly.
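The reported accuracy difference can be sanity-checked from the summary statistics alone. The sketch below recomputes a 95% CI for the difference in mean accuracy using an independent-samples normal approximation (z = 1.96) with the abstract's means, SDs, and n = 20 questions per arm; this is an assumption for illustration, since the paper's exact test (e.g. a paired design, as the completeness CI suggests) may differ:

```python
import math

# Summary statistics taken from the abstract (accuracy, n = 20 questions each).
n = 20
mean_gpt, sd_gpt = 5.83, 0.49
mean_google, sd_google = 4.75, 0.94

def diff_ci_95(m1, s1, m2, s2, n1, n2):
    """95% CI for a difference in means via an independent-samples
    normal approximation; the paper's exact method may differ."""
    diff = m1 - m2
    se = math.sqrt(s1**2 / n1 + s2**2 / n2)
    return diff, diff - 1.96 * se, diff + 1.96 * se

diff, lo, hi = diff_ci_95(mean_gpt, sd_gpt, mean_google, sd_google, n, n)
# diff = 1.08; interval ≈ (0.62, 1.54), close to the reported 0.61-1.54
```

Under this approximation the accuracy interval nearly reproduces the published 0.61-1.54, which supports the reported figures; the tighter completeness CI (0.96-1.44) would require a paired analysis to reproduce.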

CONCLUSIONS

ChatGPT's answers are characterized by accuracy, rigor, comprehensiveness, and professional supporting materials, and demonstrate humanistic care. However, the readability of the provided text is low, requiring users to have a college education background. Given the study's limitations in question scope, comparison dimensions, research perspectives, and language types, further in-depth comparative research is recommended.


