A large dataset of annotated incident reports on medication errors.

Affiliations

Graduate School of Public Health, St. Luke's International University, 3-6-2 Tsukiji, Chuo-ku, Tokyo, 104-0045, Japan.

School of Medical Sciences, The University of Sydney, Camperdown, NSW, 2006, Australia.

Publication information

Sci Data. 2024 Feb 29;11(1):260. doi: 10.1038/s41597-024-03036-2.

DOI: 10.1038/s41597-024-03036-2
PMID: 38424103
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC10904777/
Abstract

Incident reports of medication errors are valuable learning resources for improving patient safety. However, pertinent information is often contained within unstructured free text, which prevents automated analysis and limits the usefulness of these data. Natural language processing can structure this free text automatically and retrieve relevant past incidents and learning materials, but to be able to do so requires a large, fully annotated and validated corpus of incident reports. We present a corpus of 58,658 machine-annotated incident reports of medication errors that can be used to advance the development of information extraction models and subsequent incident learning. We report the best F1-scores for the annotated dataset: 0.97 and 0.76 for named entity recognition and intention/factuality analysis, respectively, for the cross-validation exercise. Our dataset contains 478,175 named entities and differentiates between incident types by recognising discrepancies between what was intended and what actually occurred. We explain our annotation workflow and technical validation and provide access to the validation datasets and machine annotator for labelling future incident reports of medication errors.
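
For readers unfamiliar with the metric, the F1-scores quoted above are the harmonic mean of precision and recall. The sketch below shows one common way to compute it for named entity recognition, assuming exact-match scoring where a predicted entity counts only if its span and label both agree with the gold annotation. The abstract does not state the scoring scheme the authors used, and the `Entity` type, the `span_f1` function, and the labels `DRUG`, `DOSE` and `ROUTE` are hypothetical names chosen for illustration.

```python
from typing import NamedTuple


class Entity(NamedTuple):
    start: int   # character offset where the span begins
    end: int     # character offset where the span ends (exclusive)
    label: str   # entity type; the labels used below are hypothetical


def span_f1(gold: set[Entity], predicted: set[Entity]) -> float:
    """Exact-match F1: a prediction counts only if span and label both match."""
    true_positives = len(gold & predicted)
    if true_positives == 0:
        return 0.0
    precision = true_positives / len(predicted)
    recall = true_positives / len(gold)
    return 2 * precision * recall / (precision + recall)


# Toy annotations (assumed data, not taken from the corpus).
gold = {Entity(0, 7, "DRUG"), Entity(8, 13, "DOSE"), Entity(20, 26, "ROUTE")}
# The third prediction has the right label but the wrong end offset, so it is a miss.
predicted = {Entity(0, 7, "DRUG"), Entity(8, 13, "DOSE"), Entity(20, 25, "ROUTE")}

score = span_f1(gold, predicted)
print(round(score, 3))
```

Exact matching is the strictest choice. A relaxed scheme that accepts overlapping spans would give a higher score for the same predictions, so published F1 values are only comparable when the matching rule is the same.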

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/0d15/10904777/4ccc5fc0d946/41597_2024_3036_Fig1_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/0d15/10904777/0c9862a7e47c/41597_2024_3036_Fig2_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/0d15/10904777/a1b7d1bd3bac/41597_2024_3036_Fig3_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/0d15/10904777/2e971ccba4e5/41597_2024_3036_Fig4_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/0d15/10904777/be2d4a6722e4/41597_2024_3036_Fig5_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/0d15/10904777/5e8d7079d487/41597_2024_3036_Fig6_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/0d15/10904777/f591ef1beba5/41597_2024_3036_Fig7_HTML.jpg

Similar articles

1. A large dataset of annotated incident reports on medication errors.
   Sci Data. 2024 Feb 29;11(1):260. doi: 10.1038/s41597-024-03036-2.
2. Task definition, annotated dataset, and supervised natural language processing models for symptom extraction from unstructured clinical notes.
   J Biomed Inform. 2020 Feb;102:103354. doi: 10.1016/j.jbi.2019.103354. Epub 2019 Dec 12.
3. Rule-Based Natural Language Processing Pipeline to Detect Medication-Related Named Entities: Insights for Transfer Learning.
   Stud Health Technol Inform. 2024 Jan 25;310:584-588. doi: 10.3233/SHTI231032.
4. A Five-Step Workflow to Manually Annotate Unstructured Data into Training Dataset for Natural Language Processing.
   Stud Health Technol Inform. 2024 Jan 25;310:109-113. doi: 10.3233/SHTI230937.
5. MLM-based typographical error correction of unstructured medical texts for named entity recognition.
   BMC Bioinformatics. 2022 Nov 16;23(1):486. doi: 10.1186/s12859-022-05035-9.
6. A Fine-Tuned Bidirectional Encoder Representations From Transformers Model for Food Named-Entity Recognition: Algorithm Development and Validation.
   J Med Internet Res. 2021 Aug 9;23(8):e28229. doi: 10.2196/28229.
7. Construction of a Multi-Label Classifier for Extracting Multiple Incident Factors From Medication Incident Reports in Residential Care Facilities: Natural Language Processing Approach.
   JMIR Med Inform. 2024 Jul 23;12:e58141. doi: 10.2196/58141.
8. An annotated corpus of clinical trial publications supporting schema-based relational information extraction.
   J Biomed Semantics. 2022 May 23;13(1):14. doi: 10.1186/s13326-022-00271-7.
9. Information extraction from multi-institutional radiology reports.
   Artif Intell Med. 2016 Jan;66:29-39. doi: 10.1016/j.artmed.2015.09.007. Epub 2015 Oct 3.
10. Accelerating the annotation of sparse named entities by dynamic sentence selection.
    BMC Bioinformatics. 2008 Nov 19;9 Suppl 11(Suppl 11):S8. doi: 10.1186/1471-2105-9-S11-S8.

Cited by

1. A scoping review of natural language processing in addressing medically inaccurate information: Errors, misinformation, and hallucination.
   J Biomed Inform. 2025 Jul 22:104866. doi: 10.1016/j.jbi.2025.104866.
2. A scoping review on generative AI and large language models in mitigating medication related harm.
   NPJ Digit Med. 2025 Mar 28;8(1):182. doi: 10.1038/s41746-025-01565-7.
3. Electronic Prescribing in the Neonatal Intensive Care Unit: Analysis of Prescribing Errors and Risk Factors.
   J Med Syst. 2025 Feb 18;49(1):26. doi: 10.1007/s10916-025-02161-8.
4. A pathway from fragmentation to interoperability through standards-based enterprise architecture to enhance patient safety.
   NPJ Digit Med. 2025 Jan 18;8(1):41. doi: 10.1038/s41746-025-01442-3.

References

1. Identifying barriers and benefits of patient safety event reporting toward user-centered design.
   Saf Health. 2015;1:7. doi: 10.1186/2056-5917-1-7. Epub 2015 Aug 27.
2. Annotation Guidelines for Medication Errors in Incident Reports: Validation Through a Mixed Methods Approach.
   Stud Health Technol Inform. 2022 Jun 6;290:354-358. doi: 10.3233/SHTI220095.
3. Evaluating resampling methods and structured features to improve fall incident report identification by the severity level.
   J Am Med Inform Assoc. 2021 Jul 30;28(8):1756-1764. doi: 10.1093/jamia/ocab048.
4. Can Unified Medical Language System-based semantic representation improve automated identification of patient safety incident reports by type and severity?
   J Am Med Inform Assoc. 2020 Oct 1;27(10):1502-1509. doi: 10.1093/jamia/ocaa082.
5. Advancing the state of the art in automatic extraction of adverse drug events from narratives.
   J Am Med Inform Assoc. 2020 Jan 1;27(1):1-2. doi: 10.1093/jamia/ocz206.
6. Medication-rights detection using incident reports: A natural language processing and deep neural network approach.
   Health Informatics J. 2020 Sep;26(3):1777-1794. doi: 10.1177/1460458219889798. Epub 2019 Dec 10.
7. Classification Scheme for Incident Reports of Medication Errors.
   Stud Health Technol Inform. 2019 Aug 9;265:113-118. doi: 10.3233/SHTI190148.
8. Natural Language Processing and Its Implications for the Future of Medication Safety: A Narrative Review of Recent Advances and Challenges.
   Pharmacotherapy. 2018 Aug;38(8):822-841. doi: 10.1002/phar.2151. Epub 2018 Jul 22.
9. Nature of Blame in Patient Safety Incident Reports: Mixed Methods Analysis of a National Database.
   Ann Fam Med. 2017 Sep;15(5):455-461. doi: 10.1370/afm.2123.
10. Using multiclass classification to automate the identification of patient safety incident reports by type and severity.
    BMC Med Inform Decis Mak. 2017 Jun 12;17(1):84. doi: 10.1186/s12911-017-0483-8.