Phenotype Extraction Based on Word Embedding to Sentence Embedding Cascaded Approach.

Publication information

IEEE Trans Nanobioscience. 2018 Jul;17(3):172-180. doi: 10.1109/TNB.2018.2838137. Epub 2018 May 18.

DOI:10.1109/TNB.2018.2838137

Abstract

Phenotypic descriptions, a significant determinant in the development of named entity recognition, are normally presented in varied ways in the biomedical literature, with complicated semantics. In this paper, a novel method is proposed to identify plant phenotypes by adopting a word-embedding-to-sentence-embedding cascaded approach. We use a word embedding method to find high-frequency phenotypes, with the original sentences used as input to a sentence embedding method. In doing so, a variety of complicated phenotypic expressions can be recognized accurately. In addition, state-of-the-art word representation models were compared, and skip-gram with negative sampling was selected as the best-performing model. To evaluate our approach, we applied it to a dataset of 56,748 PubMed abstracts on the model organism Arabidopsis thaliana. The experimental results showed that our approach yielded the best performance, achieving a 2.588-fold increase in the number of new phenotypic descriptions compared to the original phenotype ontology.
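The cascade described in the abstract can be illustrated with a toy sketch. This is not the authors' implementation: the word vectors below are invented for illustration (a real system would train them, for example with skip-gram with negative sampling, on the abstract corpus), the function names are hypothetical, and averaging word vectors is a deliberately simple stand-in for whatever sentence embedding method the paper uses.

```python
import math

# Hypothetical 3-dimensional word vectors, made up for this example.
WORD_VECTORS = {
    "dwarf":   [0.9, 0.1, 0.0],
    "stunted": [0.8, 0.2, 0.1],
    "growth":  [0.6, 0.3, 0.1],
    "mutant":  [0.5, 0.3, 0.3],
    "plants":  [0.4, 0.4, 0.4],
    "kinase":  [0.0, 0.9, 0.2],
    "binds":   [0.1, 0.8, 0.3],
    "protein": [0.1, 0.9, 0.1],
}


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def nearest_words(seed, k):
    """Stage 1 (word level): the k words most similar to a seed phenotype term."""
    seed_vec = WORD_VECTORS[seed]
    scored = [(cosine(seed_vec, vec), word)
              for word, vec in WORD_VECTORS.items() if word != seed]
    return [word for _, word in sorted(scored, reverse=True)[:k]]


def sentence_vector(sentence):
    """Stage 2 (sentence level): average the vectors of the known words."""
    vecs = [WORD_VECTORS[t] for t in sentence.lower().split() if t in WORD_VECTORS]
    if not vecs:
        return None
    return [sum(v[i] for v in vecs) / len(vecs) for i in range(len(vecs[0]))]


def rank_sentences(seed, sentences, k=2):
    """Keep sentences that mention a stage-1 term, then rank them by similarity to the seed."""
    terms = set([seed] + nearest_words(seed, k))
    seed_vec = WORD_VECTORS[seed]
    scored = []
    for sentence in sentences:
        if terms & set(sentence.lower().split()):
            vec = sentence_vector(sentence)
            if vec is not None:
                scored.append((cosine(seed_vec, vec), sentence))
    return [sentence for _, sentence in sorted(scored, reverse=True)]


if __name__ == "__main__":
    corpus = [
        "the mutant plants show stunted growth",
        "the kinase binds the protein",
    ]
    print(nearest_words("dwarf", 2))
    print(rank_sentences("dwarf", corpus))
```

The word-level stage proposes candidate terms cheaply, and the sentence-level stage then scores the full original sentences, which is how multi-word phenotypic expressions can surface even when no single word matches the seed.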

Similar articles

Neural sentence embedding models for semantic similarity estimation in the biomedical domain.

BMC Bioinformatics. 2019 Apr 11;20(1):178. doi: 10.1186/s12859-019-2789-2.

A gene-phenotype relationship extraction pipeline from the biomedical literature using a representation learning approach.

Bioinformatics. 2018 Jul 1;34(13):i386-i394. doi: 10.1093/bioinformatics/bty263.

Embedding assisted prediction architecture for event trigger identification.

J Bioinform Comput Biol. 2015 Jun;13(3):1541001. doi: 10.1142/S0219720015410012. Epub 2015 Jan 11.

A modular framework for biomedical concept recognition.

BMC Bioinformatics. 2013 Sep 24;14:281. doi: 10.1186/1471-2105-14-281.

Word and Sentence Embedding Tools to Measure Semantic Similarity of Gene Ontology Terms by Their Definitions.

J Comput Biol. 2019 Jan;26(1):38-52. doi: 10.1089/cmb.2018.0093. Epub 2018 Oct 31.

Textpresso: an ontology-based information retrieval and extraction system for biological literature.

PLoS Biol. 2004 Nov;2(11):e309. doi: 10.1371/journal.pbio.0020309. Epub 2004 Sep 21.

Fast and scalable neural embedding models for biomedical sentence classification.

BMC Bioinformatics. 2018 Dec 22;19(1):541. doi: 10.1186/s12859-018-2496-4.

Text mining and protein annotations: the construction and use of protein description sentences.

Genome Inform. 2006;17(2):121-30.

Word Spotting and Recognition with Embedded Attributes.

IEEE Trans Pattern Anal Mach Intell. 2014 Dec;36(12):2552-66. doi: 10.1109/TPAMI.2014.2339814.

Cited by

Model-Based Reasoning of Clinical Diagnosis in Integrative Medicine: Real-World Methodological Study of Electronic Medical Records and Natural Language Processing Methods.

JMIR Med Inform. 2020 Dec 21;8(12):e23082. doi: 10.2196/23082.
