Benchmarking State-of-the-Art Large Language Models for Migraine Patient Education: Performance Comparison of Responses to Common Queries. - Suppr

Similar articles

1

Benchmarking State-of-the-Art Large Language Models for Migraine Patient Education: Performance Comparison of Responses to Common Queries.

J Med Internet Res. 2024 Jul 23;26:e55927. doi: 10.2196/55927.

2

Evaluating Large Language Models for the National Premedical Exam in India: Comparative Analysis of GPT-3.5, GPT-4, and Bard.

JMIR Med Educ. 2024 Feb 21;10:e51523. doi: 10.2196/51523.

3

Benchmarking large language models' performances for myopia care: a comparative analysis of ChatGPT-3.5, ChatGPT-4.0, and Google Bard.

EBioMedicine. 2023 Sep;95:104770. doi: 10.1016/j.ebiom.2023.104770. Epub 2023 Aug 23.

4

Capacity of Generative AI to Interpret Human Emotions From Visual and Textual Data: Pilot Evaluation Study.

JMIR Ment Health. 2024 Feb 6;11:e54369. doi: 10.2196/54369.

5

Generative artificial intelligence chatbots may provide appropriate informational responses to common vascular surgery questions by patients.

Vascular. 2025 Feb;33(1):229-237. doi: 10.1177/17085381241240550. Epub 2024 Mar 18.

6

Evaluating capabilities of large language models: Performance of GPT-4 on surgical knowledge assessments.

Surgery. 2024 Apr;175(4):936-942. doi: 10.1016/j.surg.2023.12.014. Epub 2024 Jan 20.

7

Performance of artificial intelligence in bariatric surgery: comparative analysis of ChatGPT-4, Bing, and Bard in the American Society for Metabolic and Bariatric Surgery textbook of bariatric surgery questions.

Surg Obes Relat Dis. 2024 Jul;20(7):609-613. doi: 10.1016/j.soard.2024.04.014. Epub 2024 May 8.

8

The performance of artificial intelligence models in generating responses to general orthodontic questions: ChatGPT vs Google Bard.

Am J Orthod Dentofacial Orthop. 2024 Jun;165(6):652-662. doi: 10.1016/j.ajodo.2024.01.012. Epub 2024 Mar 15.

9

Language discrepancies in the performance of generative artificial intelligence models: an examination of infectious disease queries in English and Arabic.

BMC Infect Dis. 2024 Aug 8;24(1):799. doi: 10.1186/s12879-024-09725-y.

10

Utilizing Artificial Intelligence-Based Tools for Addressing Clinical Queries: ChatGPT Versus Google Gemini.

J Nurs Educ. 2024 Aug;63(8):556-559. doi: 10.3928/01484834-20240426-01. Epub 2024 Aug 1.

Articles citing this article

1

Development and Validation of a Large Language Model-Based System for Medical History-Taking Training: Prospective Multicase Study on Evaluation Stability, Human-AI Consistency, and Transparency.

JMIR Med Educ. 2025 Aug 29;11:e73419. doi: 10.2196/73419.

2

The Emerging Clinical Relevance of Artificial Intelligence, Data Science, and Wearable Devices in Headache: A Narrative Review.

Life (Basel). 2025 Jun 4;15(6):909. doi: 10.3390/life15060909.

3

AI in Home Care-Evaluation of Large Language Models for Future Training of Informal Caregivers: Observational Comparative Case Study.

References cited in this article

1

Benchmarking large language models' performances for myopia care: a comparative analysis of ChatGPT-3.5, ChatGPT-4.0, and Google Bard.

EBioMedicine. 2023 Sep;95:104770. doi: 10.1016/j.ebiom.2023.104770. Epub 2023 Aug 23.

2

Large language models encode clinical knowledge.

Nature. 2023 Aug;620(7972):172-180. doi: 10.1038/s41586-023-06291-2. Epub 2023 Jul 12.

3

Diagnostic accuracy of an artificial intelligence online engine in migraine: A multi-center study.