Enhancing perinatal health patient information through ChatGPT - An accuracy study.

Author information

de Vries P L M, Baud D, Baggio S, Ceulemans M, Favre G, Gerbier E, Legardeur H, Maisonneuve E, Pena-Reyes C, Pomar L, Winterfeld U, Panchaud A

Affiliations

Department of Gynecology and Obstetrics, Lausanne University Hospital and University of Lausanne, Lausanne, Switzerland.

Institute of Primary Health Care (BIHAM), University of Bern, Bern, Switzerland.

Publication information

PEC Innov. 2025 Feb 10;6:100381. doi: 10.1016/j.pecinn.2025.100381. eCollection 2025 Jun.

Abstract

OBJECTIVES

To evaluate ChatGPT's accuracy as an information source for women and maternity-care workers on "nutrition" and "red flags" in pregnancy.

METHODS

Eight raters assessed the accuracy of ChatGPT-generated recommendations on a 5-point Likert scale, covering ten indicators per topic in four languages (French, English, German and Dutch). Accuracy and inter-rater agreement were calculated per topic and language.

RESULTS

For both topics, median accuracy scores of ChatGPT-generated recommendations were excellent (5.0; IQR 4-5), independent of language. Median accuracy scores varied by at most 1 point on the 5-point Likert scale depending on how the question was framed. Overall accuracy scores were 83-89% for 'nutrition in pregnancy' versus 96-98% for 'red flags in pregnancy'. Inter-rater agreement was good to excellent for both topics.
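As an illustration of the kind of summary reported here, the following minimal Python sketch computes a median and interquartile range for one set of Likert ratings. The ratings are made-up example values, not study data, and the function names are hypothetical. The abstract does not state how the overall accuracy percentage was derived, so the `percent_of_maximum` helper (mean score as a share of the scale maximum) is only an assumed definition; the inter-rater agreement statistic is likewise not named in the abstract and is not sketched.

```python
from statistics import median, quantiles

# Hypothetical ratings: one 1-5 Likert score from each of eight raters
# for a single indicator. These are illustrative values, not study data.
ratings = [5, 5, 4, 5, 4, 5, 5, 3]


def summarize(scores):
    """Return (median, Q1, Q3) for a list of Likert scores."""
    # quantiles(..., n=4) returns the three quartile cut points;
    # the "inclusive" method treats the data as the full population.
    q1, _, q3 = quantiles(scores, n=4, method="inclusive")
    return median(scores), q1, q3


def percent_of_maximum(scores, scale_max=5):
    """Mean score as a percentage of the scale maximum.

    Assumption: this is one plausible way to express an overall
    accuracy percentage; the abstract does not define its formula.
    """
    return 100 * sum(scores) / (scale_max * len(scores))


if __name__ == "__main__":
    med, q1, q3 = summarize(ratings)
    print(f"median={med}, IQR={q1}-{q3}")
    print(f"percent of maximum={percent_of_maximum(ratings):.0f}%")
```

With these example ratings the summary is a median of 5.0 with an IQR of 4.0-5.0, and the assumed percentage works out to 90%.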

CONCLUSION

Although ChatGPT generated accurate recommendations for the tested indicators on nutrition and red flags during pregnancy, women should be aware of its limitations, such as inconsistencies that depend on how a question is formulated, the language used, and the woman's personal context.

INNOVATION

Despite growing interest in the potential use of artificial intelligence in healthcare, this is, to the best of our knowledge, the first study to assess limitations, such as language and question framing, that may affect the accuracy of ChatGPT-generated recommendations in key domains of perinatal health.

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/2be8/11872132/bea46f9b6eed/gr1.jpg
