Distinguished representation of identical mentions in bio-entity coreference resolution.

Affiliations

School of Computer Science and Technology, Xi'an Jiaotong University, Xi'an, 710049, Shaanxi, China.

National Engineering Lab for Big Data Analytics, Xi'an Jiaotong University, Xi'an, 710049, Shaanxi, China.

Publication information

BMC Med Inform Decis Mak. 2022 Apr 30;22(1):116. doi: 10.1186/s12911-022-01862-1.

Abstract

BACKGROUND

Bio-entity coreference resolution (CR) is a vital task in biomedical text mining. An important issue in CR is the differential representation of identical mentions, since similar representations of such mentions can make coreference harder to resolve. However, when extracting features, existing neural network-based models tend to produce similar or even identical feature representations for identical mentions, which adds noise to the task of telling them apart.

METHODS

We propose a context-aware feature attention model that distinguishes similar or identical text units effectively in order to better resolve coreference. The new model represents identical mentions according to their different contexts by adaptively exploiting features, which enables it to reduce text noise and capture semantic information effectively.
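The idea of weighting a mention's features by its context can be illustrated with a toy example. The sketch below is not the authors' architecture; it assumes a plain dot-product attention over a few hand-made feature vectors, and every name in it (context_aware_representation, mention_features, context_a, context_b) is hypothetical. It only shows how two occurrences of the same mention string, which share identical features, can end up with different representations once the attention weights depend on the surrounding context.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(u, v):
    """Dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def context_aware_representation(features, context):
    """Weight each feature vector by its relevance to the context vector.

    features: list of feature vectors describing one mention (assumed inputs).
    context:  vector summarising the text around that mention (assumed input).
    Returns the attention-weighted sum of the features and the weights.
    """
    weights = softmax([dot(f, context) for f in features])
    dim = len(context)
    rep = [sum(w * f[i] for w, f in zip(weights, features)) for i in range(dim)]
    return rep, weights


# Two occurrences of the same mention string share the same feature vectors.
# The three features are illustrative placeholders, not the paper's feature set.
mention_features = [
    [1.0, 0.0, 0.0],  # hypothetical feature 1
    [0.0, 1.0, 0.0],  # hypothetical feature 2
    [0.0, 0.0, 1.0],  # hypothetical feature 3
]

# Different surrounding contexts (toy vectors).
context_a = [2.0, 0.0, 0.0]
context_b = [0.0, 0.0, 2.0]

rep_a, weights_a = context_aware_representation(mention_features, context_a)
rep_b, weights_b = context_aware_representation(mention_features, context_b)

print("weights in context A:", [round(w, 3) for w in weights_a])
print("weights in context B:", [round(w, 3) for w in weights_b])
print("representations differ:", rep_a != rep_b)
```

In this toy setting the same features receive different weights in each context, so the two occurrences no longer collapse to one vector. A trained model would learn the scoring function rather than use a raw dot product.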

RESULTS

The experimental results show that the proposed model brings significant improvements over most of the baselines for coreference resolution and mention detection on the BioNLP and CRAFT-CR datasets. The empirical studies further demonstrate its superior performance on the differential representation and coreferential linking of identical mentions.

CONCLUSIONS

Identical mentions pose difficulties for current methods of bio-entity coreference resolution. We therefore propose the context-aware feature attention model to better distinguish identical mentions. It achieves superior performance on both coreference resolution and mention detection, which will further improve the performance of downstream tasks.

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/d9cc/9063119/6fb2fd9bd68c/12911_2022_1862_Fig1_HTML.jpg
