Potentials and pitfalls of ChatGPT and natural-language artificial intelligence models for the understanding of laboratory medicine test results. An assessment by the European Federation of Clinical Chemistry and Laboratory Medicine (EFLM) Working Group on Artificial Intelligence (WG-AI).

Affiliations

Department of Laboratory Medicine, Paracelsus Medical University Salzburg, Salzburg, Austria.

DISCo, Università degli Studi di Milano-Bicocca, Milano, Italy.

Publication information

Clin Chem Lab Med. 2023 Apr 24;61(7):1158-1166. doi: 10.1515/cclm-2023-0355. Print 2023 Jun 27.

Abstract

OBJECTIVES

ChatGPT, a tool based on natural language processing (NLP), is on everyone's mind, and several potential applications in healthcare have already been proposed. However, since the ability of this tool to interpret laboratory test results has not yet been tested, the EFLM Working Group on Artificial Intelligence (WG-AI) has set itself the task of closing this gap with a systematic approach.

METHODS

WG-AI members generated 10 simulated laboratory reports of common parameters, which were then passed to ChatGPT for interpretation, according to reference intervals (RI) and units, using an optimized prompt. The results were subsequently evaluated independently by all WG-AI members with respect to relevance, correctness, helpfulness and safety.

RESULTS

ChatGPT recognized all laboratory tests, could detect whether they deviated from the RI, and gave a test-by-test as well as an overall interpretation. The interpretations were rather superficial, not always correct, and judged coherently only in some cases. The magnitude of the deviation from the RI seldom played a role in the interpretation of laboratory tests, and the artificial intelligence (AI) did not make any meaningful suggestions regarding follow-up diagnostics or further procedures in general.
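
As an illustration of the two checks mentioned above (whether a value lies outside its RI, and by how much), a minimal Python sketch follows. The `LabResult` class, the function names, the choice to express the deviation as a fraction of the RI width, and the sample values are all assumptions made for illustration; none of them come from the study or its simulated reports.

```python
from dataclasses import dataclass


@dataclass
class LabResult:
    """One line of a laboratory report (hypothetical structure)."""
    name: str
    value: float
    ri_low: float   # lower limit of the reference interval (RI)
    ri_high: float  # upper limit of the reference interval (RI)
    unit: str


def flag(result: LabResult) -> str:
    """Return 'low', 'high', or 'normal' relative to the RI."""
    if result.value < result.ri_low:
        return "low"
    if result.value > result.ri_high:
        return "high"
    return "normal"


def deviation_ratio(result: LabResult) -> float:
    """Distance outside the RI as a fraction of the RI width.

    Returns 0.0 for values inside the RI. This normalization is an
    illustrative assumption, not a clinical standard.
    """
    width = result.ri_high - result.ri_low
    if result.value < result.ri_low:
        return (result.ri_low - result.value) / width
    if result.value > result.ri_high:
        return (result.value - result.ri_high) / width
    return 0.0


# Illustrative values only; not taken from the study.
report = [
    LabResult("Hemoglobin", 9.0, 12.0, 16.0, "g/dL"),
    LabResult("Potassium", 4.2, 3.5, 5.1, "mmol/L"),
]

for r in report:
    print(f"{r.name}: {r.value} {r.unit} -> {flag(r)}, "
          f"deviation {deviation_ratio(r):.2f} x RI width")
```

A check of this kind only reproduces the test-by-test flagging; it says nothing about the overall diagnostic picture.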

CONCLUSIONS

ChatGPT in its current form, not being specifically trained on medical data, or on laboratory data in particular, may at best be considered a tool capable of interpreting a laboratory report on a test-by-test basis, but not of interpreting an overall diagnostic picture. Future generations of similar AIs trained on medical ground-truth data may well revolutionize current processes in healthcare, although such an implementation is not ready yet.
