Disambiguating Clinical Abbreviations by One-to-All Classification: Algorithm Development and Validation Study.

Affiliations

Division of Neurology, Department of Internal Medicine, Ditmanson Medical Foundation Chia-Yi Christian Hospital, Chiayi City, Taiwan.

Department of Nursing, Fooyin University, Kaohsiung, Taiwan.

Publication information

JMIR Med Inform. 2024 Oct 1;12:e56955. doi: 10.2196/56955.

Abstract

BACKGROUND

Electronic medical records store extensive patient data and serve as a comprehensive repository, including textual medical records like surgical and imaging reports. Their utility in clinical decision support systems is substantial, but the widespread use of ambiguous and unstandardized abbreviations in clinical documents poses challenges for natural language processing in clinical decision support systems. Efficient abbreviation disambiguation methods are needed for effective information extraction.

OBJECTIVE

This study aims to enhance the one-to-all (OTA) framework for clinical abbreviation expansion, in which a single model predicts the meanings of multiple abbreviations. The objective is to improve OTA by constructing context-candidate pairs and optimizing word embeddings in Bidirectional Encoder Representations From Transformers (BERT), and to evaluate the model's efficacy in expanding clinical abbreviations using real data.

METHODS

Three datasets were used: Medical Subject Headings Word Sense Disambiguation, University of Minnesota, and a Chia-Yi Christian Hospital dataset from Ditmanson Medical Foundation Chia-Yi Christian Hospital. Texts containing polysemous abbreviations were preprocessed and formatted for BERT. Two pretrained models, ClinicalBERT and BlueBERT, were fine-tuned on training and testing pairs generated following the method of Huang et al.
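
The context-candidate pairing mentioned above can be illustrated with a minimal sketch. The abstract does not give the exact pair format of Huang et al, so everything here is an assumption made for illustration: the `SENSES` inventory, the `Pair` type, and the `build_pairs` and `predict` helpers are hypothetical names, and `score_fn` stands in for a fine-tuned BERT-style model that scores how well a candidate expansion fits a context.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class Pair:
    context: str    # sentence containing the abbreviation
    candidate: str  # one possible expansion of the abbreviation
    label: int      # 1 if the candidate is the correct expansion, else 0


# Hypothetical sense inventory; a real one would come from the dataset.
SENSES: Dict[str, List[str]] = {
    "RA": ["rheumatoid arthritis", "right atrium", "room air"],
    "PE": ["pulmonary embolism", "physical examination"],
}


def build_pairs(context: str, abbreviation: str, gold_sense: str,
                inventory: Dict[str, List[str]] = SENSES) -> List[Pair]:
    """Create one binary-labelled pair per candidate expansion."""
    return [Pair(context, cand, int(cand == gold_sense))
            for cand in inventory[abbreviation]]


def predict(context: str, abbreviation: str,
            score_fn: Callable[[str, str], float],
            inventory: Dict[str, List[str]] = SENSES) -> str:
    """Return the candidate that the (assumed) scoring model rates highest."""
    return max(inventory[abbreviation], key=lambda c: score_fn(context, c))


example_pairs = build_pairs("Patient saturating 98% on RA.", "RA", "room air")
for pair in example_pairs:
    print(pair.candidate, pair.label)
```

Because every abbreviation is reduced to the same binary question (does this candidate fit this context?), one model can serve all abbreviations, which is the sense of "one-to-all" described in the Objective.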

RESULTS

BlueBERT achieved macro- and microaccuracies of 95.41% and 95.16%, respectively, on the Medical Subject Headings Word Sense Disambiguation dataset. It improved macroaccuracy by 0.54%-1.53% compared to two baselines, long short-term memory and deepBioWSD with random embedding. On the University of Minnesota dataset, BlueBERT recorded macro- and microaccuracies of 98.40% and 98.22%, respectively. Against the baselines of Word2Vec + support vector machine and BioWordVec + support vector machine, BlueBERT demonstrated a macroaccuracy improvement of 2.61%-4.13%.
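
The abstract does not define macro- and microaccuracy. The sketch below assumes the convention common in word sense disambiguation work: macroaccuracy averages the per-abbreviation accuracies, while microaccuracy pools all instances. The function name `macro_micro_accuracy` and the toy records are hypothetical and are not the study's data.

```python
from collections import defaultdict
from typing import Iterable, Tuple


def macro_micro_accuracy(
    records: Iterable[Tuple[str, str, str]]
) -> Tuple[float, float]:
    """records holds (abbreviation, gold_sense, predicted_sense) triples."""
    correct = defaultdict(int)
    total = defaultdict(int)
    for abbreviation, gold, predicted in records:
        total[abbreviation] += 1
        correct[abbreviation] += int(gold == predicted)
    # Micro: every instance counts equally, regardless of abbreviation.
    micro = sum(correct.values()) / sum(total.values())
    # Macro: every abbreviation counts equally, regardless of frequency.
    macro = sum(correct[a] / total[a] for a in total) / len(total)
    return macro, micro


# Toy data: "RA" has 3 of 4 correct, "PE" has 1 of 1 correct.
toy_records = [
    ("RA", "room air", "room air"),
    ("RA", "room air", "room air"),
    ("RA", "right atrium", "right atrium"),
    ("RA", "rheumatoid arthritis", "room air"),
    ("PE", "pulmonary embolism", "pulmonary embolism"),
]
macro, micro = macro_micro_accuracy(toy_records)
print(macro, micro)  # 0.875 0.8
```

The two figures diverge when abbreviations have unequal instance counts, which is why both are usually reported.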

CONCLUSIONS

This research preliminarily validated the effectiveness of the OTA method for abbreviation disambiguation in medical texts, demonstrating the potential to enhance both clinical staff efficiency and research effectiveness.
