Coreference analysis in clinical notes: a multi-pass sieve with alternate anaphora resolution modules.

Affiliation

Department of Health Sciences Research, Mayo Clinic, Rochester, Minnesota 55905, USA.

Publication information

J Am Med Inform Assoc. 2012 Sep-Oct;19(5):867-74. doi: 10.1136/amiajnl-2011-000766. Epub 2012 Jun 16.

DOI: 10.1136/amiajnl-2011-000766
PMID: 22707745
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC3422831/

Abstract

OBJECTIVE

This paper describes the coreference resolution system submitted by Mayo Clinic for the 2011 i2b2/VA/Cincinnati shared task Track 1C. The goal of the task was to construct a system that links the markables corresponding to the same entity.

MATERIALS AND METHODS

The task organizers provided progress notes and discharge summaries annotated with five types of markables: treatment, problem, test, person, and pronoun. We used a multi-pass sieve algorithm that applies deterministic rules in decreasing order of precision while gathering information about the entities in each document. Our system, MedCoref, also uses a state-of-the-art machine learning framework as an alternative to the final, rule-based pronoun resolution sieve.
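
To make the multi-pass idea concrete, here is a minimal sketch in Python. It is not the MedCoref implementation: the three rules (`exact_match`, `head_match`, `pronoun_to_person`), the `Mention` fields, and the toy document are assumptions chosen for illustration. Only the overall shape, with high-precision rules applied before looser ones over a shared set of clusters, follows the description above. The sketch also omits the step where passes consult information gathered about each entity.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Mention:
    idx: int   # position in document order (must equal the list index)
    text: str
    kind: str  # treatment, problem, test, person, or pronoun


class Clusters:
    """Union-find over mention indices; each set stands for one entity."""

    def __init__(self, n):
        self.parent = list(range(n))

    def find(self, i):
        while self.parent[i] != i:
            self.parent[i] = self.parent[self.parent[i]]  # path halving
            i = self.parent[i]
        return i

    def union(self, i, j):
        ri, rj = self.find(i), self.find(j)
        if ri != rj:
            self.parent[max(ri, rj)] = min(ri, rj)

    def groups(self):
        out = {}
        for i in range(len(self.parent)):
            out.setdefault(self.find(i), []).append(i)
        return sorted(out.values())


# Hypothetical rules, ordered from most to least precise.
def exact_match(m, a):
    return m.kind == a.kind and m.kind != "pronoun" and m.text.lower() == a.text.lower()


def head_match(m, a):
    # Crude head word: the last token of the mention.
    same_head = m.text.lower().split()[-1] == a.text.lower().split()[-1]
    return m.kind == a.kind and m.kind != "pronoun" and same_head


def pronoun_to_person(m, a):
    return m.kind == "pronoun" and m.text.lower() in {"he", "she", "him", "her"} and a.kind == "person"


def run_sieves(mentions, sieves):
    clusters = Clusters(len(mentions))
    for sieve in sieves:                  # one pass over the document per rule
        for m in mentions:
            for a in reversed(mentions[:m.idx]):  # nearest antecedent first
                if sieve(m, a):
                    clusters.union(m.idx, a.idx)
                    break
    return clusters.groups()


doc = [
    Mention(0, "the patient", "person"),
    Mention(1, "chest pain", "problem"),
    Mention(2, "he", "pronoun"),
    Mention(3, "Chest pain", "problem"),
    Mention(4, "the pain", "problem"),
    Mention(5, "aspirin", "treatment"),
]
chains = run_sieves(doc, [exact_match, head_match, pronoun_to_person])
print(chains)  # [[0, 2], [1, 3, 4], [5]]
```

Because every pass writes into the same `Clusters` object, a link made by an early, strict rule is never undone by a later, looser one.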

RESULTS

The best system that uses a multi-pass sieve has an overall score of 0.836 (the average of the B³, MUC, BLANC, and CEAF F scores) on the training set and 0.843 on the test set.
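
As a rough illustration of how such an overall score is assembled, the sketch below computes the B³ precision, recall, and F score for a toy key/response pair and then takes an unweighted mean of four F scores. It assumes the key and response cover the same set of mentions, which matches a setting where markables are given. The per-metric values passed to `overall_score` are made up for the example and are not the paper's results. MUC, BLANC, and CEAF are not implemented here.

```python
def b_cubed(key, response):
    """B-cubed scores for two clusterings of the same mentions.

    key, response: iterables of sets of mention ids.
    Returns (precision, recall, f_score).
    """
    key_of = {m: frozenset(c) for c in key for m in c}
    resp_of = {m: frozenset(c) for c in response for m in c}
    mentions = list(key_of)
    # Per-mention overlap between its key cluster and its response cluster.
    precision = sum(len(key_of[m] & resp_of[m]) / len(resp_of[m]) for m in mentions) / len(mentions)
    recall = sum(len(key_of[m] & resp_of[m]) / len(key_of[m]) for m in mentions) / len(mentions)
    f_score = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f_score


def overall_score(f_scores):
    """Unweighted mean of the per-metric F scores."""
    return sum(f_scores.values()) / len(f_scores)


key = [{1, 2, 3}, {4, 5}]
response = [{1, 2}, {3, 4, 5}]
p, r, f = b_cubed(key, response)  # each equals 11/15 for this toy pair

# Hypothetical per-metric F scores, for illustration only.
example = {"B3": 0.8, "MUC": 0.9, "BLANC": 0.7, "CEAF": 0.8}
print(round(overall_score(example), 3))  # 0.8
```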

DISCUSSION

A supervised machine learning system, which typically uses a single function to find coreferents, cannot accommodate irregularities encountered in the data, especially when the number of training examples is insufficient. On the other hand, a completely deterministic system can suffer a decrease in recall (sensitivity) when its rules are not exhaustive. The sieve-based framework allows one to combine reliable machine learning components with rules designed by experts.
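
One way to picture this combination is a final pass whose resolver is swappable. The sketch below is an assumption-laden illustration, not MedCoref's code: `rule_resolver`, `make_scored_resolver`, and the candidate dictionaries are hypothetical, and `toy_score` is a hand-set scoring function standing in for a trained model. Both resolvers share one signature, so either can serve as the last sieve.

```python
PERSONAL = {"he", "she", "him", "her"}


def rule_resolver(pronoun, candidates):
    """Deterministic pass: link a personal pronoun to the nearest person.

    Returns the antecedent's idx, or None when no rule applies.
    """
    if pronoun.lower() not in PERSONAL:
        return None
    persons = [c for c in candidates if c["kind"] == "person"]
    if not persons:
        return None
    return min(persons, key=lambda c: c["distance"])["idx"]


def make_scored_resolver(score, threshold=0.5):
    """Wrap any scoring function (e.g. a trained classifier) as a resolver."""
    def resolve(pronoun, candidates):
        if not candidates:
            return None
        best = max(candidates, key=lambda c: score(pronoun, c))
        return best["idx"] if score(pronoun, best) >= threshold else None
    return resolve


def toy_score(pronoun, cand):
    # Stand-in for a model probability; the weights are hand-set, not learned.
    agrees = (cand["kind"] == "person") == (pronoun.lower() in PERSONAL)
    return (0.6 if agrees else 0.1) + 0.3 / cand["distance"]


candidates = [
    {"idx": 0, "kind": "person", "distance": 3},
    {"idx": 1, "kind": "test", "distance": 1},
]
scored_resolver = make_scored_resolver(toy_score)

print(rule_resolver("it", candidates))    # None
print(scored_resolver("it", candidates))  # 1
```

In this toy case the rule-based resolver has no rule for "it" and abstains, which is the recall gap described above, while the scored resolver still proposes an antecedent.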

CONCLUSION

Using relatively simple rules, part-of-speech information, and semantic type properties, an effective coreference resolution system can be designed. The source code of the system described is available at https://sourceforge.net/projects/ohnlp/files/MedCoref.
