Hisamatsu Takashi, Fukuda Mari, Kinuta Minako, Kanda Hideyuki
Department of Public Health, Okayama University Graduate School of Medicine, Dentistry and Pharmaceutical Sciences.
J Atheroscler Thromb. 2025 May 1;32(5):567-579. doi: 10.5551/jat.65240. Epub 2024 Oct 30.
Artificial intelligence is increasingly used in the medical field. We assessed the accuracy and reproducibility of responses by ChatGPT to clinical questions (CQs) in the Japan Atherosclerosis Society Guidelines for Prevention of Atherosclerotic Cardiovascular Diseases 2022 (JAS Guidelines 2022).
In June 2024, we assessed responses by ChatGPT (version 3.5) to CQs, comprising background questions (BQs) and foreground questions (FQs). Three researchers independently rated accuracy on six-point Likert scales ranging from 1 ("completely incorrect") to 6 ("completely correct"), evaluating responses to CQs posed in Japanese and in English translation. For the reproducibility assessment, each CQ was asked five times, each in a new chat; the responses were scored on the same six-point Likert scales, and Fleiss kappa coefficients were calculated.
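The Fleiss kappa statistic used here measures agreement beyond chance across the five repeated runs per CQ. A minimal sketch of the standard calculation is below; the ratings matrix is illustrative toy data, not the study's data, and the binning of Likert scores into categories is an assumption for the example.

```python
# Hedged sketch: Fleiss' kappa for agreement across repeated runs.
# counts[i][j] = number of runs that gave subject i (a CQ) category j
# (a Likert score); every row must sum to n, the runs per subject.

def fleiss_kappa(counts):
    N = len(counts)          # number of subjects (CQs)
    n = sum(counts[0])       # ratings per subject (repeated runs)
    k = len(counts[0])       # number of categories (Likert levels)
    # mean observed per-subject agreement
    P_bar = sum((sum(c * c for c in row) - n) / (n * (n - 1))
                for row in counts) / N
    # chance agreement from the marginal category proportions
    p = [sum(row[j] for row in counts) / (N * n) for j in range(k)]
    P_e = sum(pj * pj for pj in p)
    return (P_bar - P_e) / (1 - P_e)

# Toy example: 4 CQs, each asked 5 times, scores on a 6-point scale
ratings = [
    [0, 0, 0, 0, 5, 0],   # all five runs scored this CQ a 5
    [0, 0, 0, 1, 4, 0],
    [0, 0, 0, 0, 0, 5],
    [0, 0, 1, 0, 4, 0],
]
print(round(fleiss_kappa(ratings), 3))
```

Kappa near 1 indicates the model gives nearly identical scores on repeated asking; values around 0 indicate agreement no better than chance.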
The median (25th-75th percentile) scores for ChatGPT's responses to BQs and FQs were 4 (3-5) and 5 (5-6) for Japanese CQs and 5 (3-6) and 6 (5-6) for English CQs, respectively. Response scores were higher for FQs than for BQs (P<0.001 for both Japanese and English). Accuracy was similar between Japanese and English CQs (P=0.139 for BQs; P=0.586 for FQs). Kappa coefficients for reproducibility were 0.76 for BQs and 0.90 for FQs.
ChatGPT showed high accuracy and reproducibility in responding to JAS Guidelines 2022 CQs, especially FQs. While ChatGPT primarily reflects existing guidelines, its strength could lie in rapidly organizing and presenting relevant information, thus supporting instant and more efficient guideline interpretation and aiding in medical decision-making.