Efendioglu Eyyup Murat, Cigiloglu Ahmet
Department of Internal Medicine, Division of Geriatric Medicine, Gaziantep City Hospital, Gaziantep, Turkey.
Department of Internal Medicine, Division of Geriatric Medicine, Kahramanmaraş Necip Fazıl City Hospital, 46050, Dulkadiroglu, Kahramanmaraş, Turkey.
Eur Geriatr Med. 2025 Apr 21. doi: 10.1007/s41999-025-01202-2.
ChatGPT, a comprehensive language processing model, provides the opportunity for supportive and professional interactions with patients. However, its use to address patients' frequently asked questions (FAQs) and the readability of the text generated by ChatGPT remain unexplored, particularly in geriatrics. We identified the FAQs about common geriatric syndromes and assessed the accuracy and readability of the responses provided by ChatGPT.
Two geriatricians with extensive knowledge and experience in geriatric syndromes independently reviewed the 28 responses provided by ChatGPT. The accuracy of the responses generated by ChatGPT was categorized on a rating scale from 0 (harmful) to 4 (excellent) based on current guidelines and approaches. The readability of the text generated by ChatGPT was assessed by administering two tests: the Flesch-Kincaid Reading Ease (FKRE) and the Flesch-Kincaid Grade Level (FKGL).
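The FKRE and FKGL scores used above can be computed from three raw counts (words, sentences, syllables) with the standard Flesch-Kincaid formulas. A minimal sketch in Python, assuming the counts have already been extracted from the text (automatic syllable counting is a separate heuristic step not shown here):

```python
def flesch_kincaid(total_words: int, total_sentences: int, total_syllables: int):
    """Return (FKRE, FKGL) from raw text counts.

    FKRE (Reading Ease): higher = easier; ~0-30 is 'very difficult',
    typically understood by college graduates.
    FKGL (Grade Level): approximate US school grade needed to understand the text.
    """
    wps = total_words / total_sentences    # average words per sentence
    spw = total_syllables / total_words    # average syllables per word
    fkre = 206.835 - 1.015 * wps - 84.6 * spw
    fkgl = 0.39 * wps + 11.8 * spw - 15.59
    return fkre, fkgl

# Illustrative counts (hypothetical, not from the study's corpus):
# 100 words across 5 sentences with 180 syllables.
fkre, fkgl = flesch_kincaid(100, 5, 180)
# fkre ≈ 34.3 ('difficult'), fkgl ≈ 13.5 (college level)
```

For context, the study's mean FKRE of 25.2 falls in the "very difficult" band (0-30), and the mean FKGL of 14.5 corresponds to roughly two years of college education.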
ChatGPT generated responses with an overall mean accuracy score of 88% (3.52/4). Responses generated for sarcopenia diagnosis and depression treatment in older adults had the lowest accuracy scores (2.0 and 2.5, respectively). The mean FKRE score of the texts was 25.2, while the mean FKGL score was 14.5.
The accuracy scores of the responses generated by ChatGPT were high in most common geriatric syndromes except for sarcopenia diagnosis and depression treatment. Moreover, the text generated by ChatGPT was very difficult to read and best understood by college graduates. ChatGPT may reduce the uncertainty many patients face. Nevertheless, it remains advisable to consult with subject matter experts when undertaking consequential decision-making.