The Potential of Using Generative AI/NLP to Identify and Analyse Critical Incidents in a Critical Incident Reporting System (CIRS): A Feasibility Case-Control Study.

Authors

Hölzing Carlos Ramon, Rumpf Sebastian, Huber Stephan, Papenfuß Nathalie, Meybohm Patrick, Happel Oliver

Affiliations

Department of Anaesthesiology, Intensive Care, Emergency and Pain Medicine, University Hospital Würzburg, Oberdürrbacher Str. 6, 97080 Würzburg, Germany.

Psychological Ergonomics, University of Würzburg, 97070 Würzburg, Germany.

Publication

Healthcare (Basel). 2024 Oct 2;12(19):1964. doi: 10.3390/healthcare12191964.

Abstract

BACKGROUND

To enhance patient safety in healthcare, it is crucial to address the underreporting of issues in Critical Incident Reporting Systems (CIRSs). This study aims to evaluate the effectiveness of generative Artificial Intelligence and Natural Language Processing (AI/NLP) in reviewing CIRS cases by comparing its performance with human reviewers and categorising these cases into relevant topics.

METHODS

A case-control feasibility study was conducted using CIRS cases from the German CIRS-Anaesthesiology subsystem. Each case was reviewed by a human expert and by an AI/NLP model (ChatGPT-3.5). Two CIRS experts blindly assessed these reviews, rating them on linguistic quality, recognisable expertise, logical derivability, and overall quality using six-point Likert scales.

RESULTS

On average, the CIRS experts correctly classified 80% of human CIRS reviews as created by a human and misclassified 45.8% of AI reviews as written by a human. Ratings on a scale of 1 (very good) to 6 (failed) revealed a comparable performance between human- and AI-generated reviews across the dimensions of linguistic expression (p = 0.39), recognisable expertise (p = 0.89), logical derivability (p = 0.84), and overall quality (p = 0.87). The AI model was able to categorise the cases into relevant topics independently.
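The two attribution rates can be put on a common footing with a little arithmetic. The following is a minimal sketch, not part of the study: it takes only the two averaged percentages reported above, derives the share of AI reviews that the experts recognised as AI-written, and combines both rates into a balanced accuracy. The balanced-accuracy figure is an illustrative summary and is not reported in the abstract; all variable names are hypothetical.

```python
# Rates reported in the abstract, expressed as fractions (averaged over the experts).
human_identified_as_human = 0.80   # human reviews correctly attributed to a human
ai_identified_as_human = 0.458     # AI reviews wrongly attributed to a human

# Derived: share of AI reviews the experts correctly attributed to the AI.
ai_identified_as_ai = 1.0 - ai_identified_as_human

# Balanced accuracy: mean of the two per-class correct-attribution rates.
# Illustrative summary only; this value does not appear in the abstract.
balanced_accuracy = (human_identified_as_human + ai_identified_as_ai) / 2

print(f"AI reviews recognised as AI: {ai_identified_as_ai:.1%}")
print(f"Balanced accuracy of source attribution: {balanced_accuracy:.1%}")
```

Under these figures the experts recognised AI-written reviews only slightly more often than a coin flip would, which is consistent with the comparable quality ratings.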

CONCLUSIONS

This feasibility study demonstrates the potential of generative AI/NLP in analysing and categorising cases from the CIRS. This could have implications for improving incident reporting in healthcare. Therefore, additional research is required to verify and expand upon these discoveries.

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/7e10/11475821/289e8fa06f68/healthcare-12-01964-g001.jpg
