Evaluating the comprehension and accuracy of ChatGPT's responses to diabetes-related questions in Urdu compared to English.

Author information

Faisal Seyreen, Kamran Tafiya Erum, Khalid Rimsha, Haider Zaira, Siddiqui Yusra, Saeed Nadia, Imran Sunaina, Faisal Romaan, Jabeen Misbah

Affiliations

Shifa College of Medicine, Shifa Tameer-e-Millat University, Islamabad, Pakistan.

Department of Internal Medicine, Shifa College of Medicine, Shifa Tameer-e-Millat University, Islamabad, Pakistan.

Publication information

Digit Health. 2024 Oct 17;10:20552076241289730. doi: 10.1177/20552076241289730. eCollection 2024 Jan-Dec.

Abstract

INTRODUCTION

Patients with diabetes require healthcare and information that are accurate and extensive. Large language models (LLMs) such as ChatGPT promise to provide information of this kind. This study aimed to determine (a) the comprehensiveness of ChatGPT's responses in Urdu to diabetes-related questions and (b) the accuracy of ChatGPT's Urdu responses compared with its English responses.

METHODS

A cross-sectional observational study was conducted. Two reviewers experienced in internal medicine and endocrinology graded ChatGPT's Urdu and English responses to 53 questions on diabetes knowledge, lifestyle, and prevention. A senior reviewer resolved discrepancies. The Urdu responses were assessed for comprehension and accuracy and then compared with the English responses.

RESULTS

Among the Urdu responses, only 2 of 53 (3.8%) were graded as comprehensive and 5 of 53 (9.4%) as correct but inadequate. The largest share, 25 of 53 (47.2%), was graded as a mix of correct and incorrect or outdated information. On the scale comparing the accuracy of Urdu and English responses, no Urdu response (0.0%) was judged more accurate than its English counterpart, and the overwhelming majority, 49 of 53 (92.5%), were judged less accurate.
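As a quick arithmetic check, the percentages reported in the results can be reproduced from the stated counts. The sketch below is only illustrative: the category labels and the `percent` helper are names chosen here, not taken from the study, and only the counts given in the abstract are used.

```python
# Counts as reported in the abstract, out of 53 questions.
# The dictionary keys are shorthand labels assumed for this sketch.
TOTAL = 53
counts = {
    "comprehensive": 2,
    "correct_but_inadequate": 5,
    "mixed_correct_and_incorrect": 25,
    "urdu_more_accurate_than_english": 0,
    "urdu_less_accurate_than_english": 49,
}


def percent(count, total=TOTAL):
    """Return count/total as a percentage rounded to one decimal place."""
    return round(100 * count / total, 1)


percentages = {label: percent(n) for label, n in counts.items()}

for label, value in percentages.items():
    print(f"{label}: {counts[label]}/{TOTAL} = {value}%")
```

The first three categories sum to 32 of 53; the abstract does not report how the remaining responses were graded, so they are left out here.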

CONCLUSION

We found that although ChatGPT's ability to retrieve information about diabetes is impressive, it should serve only as an adjunct rather than as a sole source of information. Further work is needed to optimize Urdu responses in medical contexts so that the model can approach the potential it promises.
