BioCause: Annotating and analysing causality in the biomedical domain.

Affiliation

The National Centre for Text Mining, School of Computer Science, The University of Manchester, 131 Princess Street, Manchester M1 7DN, UK.

Publication information

BMC Bioinformatics. 2013 Jan 16;14:2. doi: 10.1186/1471-2105-14-2.

DOI:10.1186/1471-2105-14-2
PMID:23323613
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC3621543/
Abstract

BACKGROUND

Biomedical corpora annotated with event-level information represent an important resource for domain-specific information extraction (IE) systems. However, bio-event annotation alone cannot cater for all the needs of biologists. Unlike work on relation and event extraction, most of which focusses on specific events and named entities, we aim to build a comprehensive resource, covering all statements of causal association present in discourse. Causality lies at the heart of biomedical knowledge, such as diagnosis, pathology or systems biology, and, thus, automatic causality recognition can greatly reduce the human workload by suggesting possible causal connections and aiding in the curation of pathway models. A biomedical text corpus annotated with such relations is, hence, crucial for developing and evaluating biomedical text mining.

RESULTS

We have defined an annotation scheme for enriching biomedical domain corpora with causality relations. This schema has subsequently been used to annotate 851 causal relations to form BioCause, a collection of 19 open-access full-text biomedical journal articles belonging to the subdomain of infectious diseases. These documents have been pre-annotated with named entity and event information in the context of previous shared tasks. We report an inter-annotator agreement rate of over 60% for triggers and of over 80% for arguments using an exact match constraint. These increase significantly using a relaxed match setting. Moreover, we analyse and describe the causality relations in BioCause from various points of view. This information can then be leveraged for the training of automatic causality detection systems.
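
The difference between exact and relaxed matching can be illustrated with a short sketch. The abstract does not state how agreement was computed, so everything below is an assumption made for illustration: spans are hypothetical `(start, end)` character offsets, an exact match requires identical offsets, a relaxed match requires only that two spans overlap, and agreement is taken as the fraction of one annotator's spans matched by the other. The function names and sample offsets are invented and are not taken from BioCause.

```python
from typing import Callable, List, Tuple

# Hypothetical span representation: (start, end) character offsets, end exclusive.
Span = Tuple[int, int]


def exact_match(a: Span, b: Span) -> bool:
    """Two spans match only if their boundaries are identical."""
    return a == b


def relaxed_match(a: Span, b: Span) -> bool:
    """Two spans match if they share at least one character (assumed definition)."""
    return a[0] < b[1] and b[0] < a[1]


def agreement(ann_a: List[Span], ann_b: List[Span],
              match: Callable[[Span, Span], bool]) -> float:
    """Fraction of annotator A's spans that some span of annotator B matches."""
    if not ann_a:
        return 0.0
    hits = sum(1 for a in ann_a if any(match(a, b) for b in ann_b))
    return hits / len(ann_a)


# Invented trigger spans for two annotators on the same text.
annotator_a = [(0, 9), (20, 27), (40, 52)]
annotator_b = [(0, 9), (18, 27), (60, 70)]

exact = agreement(annotator_a, annotator_b, exact_match)
relaxed = agreement(annotator_a, annotator_b, relaxed_match)
print(f"exact: {exact:.2f}, relaxed: {relaxed:.2f}")
```

Under these assumptions the second pair of spans differs only in its start boundary, so it counts under relaxed matching but not under exact matching, which is why a relaxed setting can only raise the agreement figure, never lower it.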

CONCLUSION

Augmenting named entity and event annotations with information about causal discourse relations could benefit the development of more sophisticated IE systems. These will further influence the development of multiple tasks, such as enabling textual inference to detect entailments, discovering new facts and providing new hypotheses for experimental work.
