Deciphering clinical abbreviations with a privacy protecting machine learning system.

Affiliation

Google, Mountain View, CA, USA.

Publication information

Nat Commun. 2022 Dec 2;13(1):7456. doi: 10.1038/s41467-022-35007-9.

Abstract

Physicians write clinical notes with abbreviations and shorthand that are difficult to decipher. Abbreviations can be clinical jargon (writing "HIT" for "heparin induced thrombocytopenia"), ambiguous terms that require expertise to disambiguate (using "MS" for "multiple sclerosis" or "mental status"), or domain-specific vernacular ("cb" for "complicated by"). Here we train machine learning models on public web data to decode such text by replacing abbreviations with their meanings. We report a single translation model that simultaneously detects and expands thousands of abbreviations in real clinical notes, with accuracies ranging from 92.1% to 97.1% on multiple external test datasets. The model equals or exceeds the performance of board-certified physicians (97.6% vs 88.7% total accuracy). Our results demonstrate a general method to contextually decipher abbreviations and shorthand that is built without any privacy-compromising data.
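To make the task concrete, the following is a minimal toy sketch of "detect and expand" with context-dependent disambiguation. It is not the translation model described in the abstract: it swaps in a hand-written lookup table with cue words, and the names `LEXICON` and `expand`, the cue-word sets, and the example notes are all assumptions made for illustration.

```python
import re

# Hypothetical toy lexicon (not from the paper): each abbreviation maps to
# candidate expansions, each paired with lowercase context cue words.
LEXICON = {
    "HIT": [("heparin induced thrombocytopenia", set())],
    "cb": [("complicated by", set())],
    "MS": [
        ("multiple sclerosis", {"relapsing", "lesions", "neurology", "mri"}),
        ("mental status", {"altered", "confused", "alert", "oriented"}),
    ],
}

# Split a note into alternating runs of letters and non-letters so that
# punctuation and spacing are preserved when the pieces are re-joined.
TOKEN = re.compile(r"[A-Za-z]+|[^A-Za-z]+")


def expand(note: str) -> str:
    """Replace known abbreviations, using cue words to pick a meaning."""
    pieces = TOKEN.findall(note)
    # The whole note, lowercased, serves as the context for every abbreviation.
    context = {p.lower() for p in pieces if p.isalpha()}
    out = []
    for piece in pieces:
        candidates = LEXICON.get(piece)  # case-sensitive on purpose
        if candidates is None:
            out.append(piece)
            continue
        # Choose the candidate sharing the most cue words with the context;
        # ties (including zero matches) fall back to the first candidate.
        best = max(candidates, key=lambda c: len(c[1] & context))
        out.append(best[0])
    return "".join(out)


if __name__ == "__main__":
    print(expand("Pt with altered MS, cb HIT."))
    print(expand("MRI shows new lesions consistent with MS."))
```

A fixed table like this only covers abbreviations listed in advance and breaks down when cue words are missing or misleading, which is the gap a learned, context-aware model of the kind the abstract describes is meant to close.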

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/62f4/9718734/99f6cfb1e5a3/41467_2022_35007_Fig1_HTML.jpg
