Phenotype Extraction Based on Word Embedding to Sentence Embedding Cascaded Approach.

Publication information

IEEE Trans Nanobioscience. 2018 Jul;17(3):172-180. doi: 10.1109/TNB.2018.2838137. Epub 2018 May 18.

DOI: 10.1109/TNB.2018.2838137
PMID: 29994536
Abstract

Phenotypic descriptions, a significant factor in the development of named entity recognition, are expressed in varied ways and with complicated semantics in the biomedical literature. This paper proposes a novel approach that identifies plant phenotypes by cascading word embedding into sentence embedding. We use a word embedding method to find high-frequency phenotypes, with the original sentences used as input to a sentence embedding method. In doing so, a variety of complicated phenotypic expressions can be recognized accurately. We also compared state-of-the-art word representation models and selected skip-gram with negative sampling, which performed best. To evaluate our approach, we applied it to a dataset of 56,748 PubMed abstracts on the model organism Arabidopsis thaliana. The experimental results showed that our approach yielded the best performance, achieving a 2.588-fold increase in the number of new phenotypic descriptions compared with the original phenotype ontology.
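The abstract gives no implementation details, so the following is only a minimal sketch of what a word-to-sentence embedding cascade could look like, not the authors' method. It assumes toy, hand-written 3-dimensional word vectors (a real pipeline would train them, for example with skip-gram with negative sampling), and it assumes a sentence vector is the plain average of its word vectors. The names `WORD_VECTORS`, `nearest_words`, `sentence_vector`, and `rank_sentences` are hypothetical.

```python
import math

# Toy 3-dimensional word vectors, invented purely for illustration.
# A real system would learn these from a corpus (e.g. with skip-gram
# with negative sampling); the numbers below are assumptions.
WORD_VECTORS = {
    "dwarf":   [0.9, 0.1, 0.0],
    "short":   [0.8, 0.2, 0.1],
    "stature": [0.7, 0.3, 0.0],
    "plants":  [0.3, 0.3, 0.3],
    "mutant":  [0.4, 0.4, 0.2],
    "protein": [0.0, 0.1, 0.9],
    "binds":   [0.1, 0.0, 0.8],
    "dna":     [0.0, 0.2, 0.9],
}


def cosine(u, v):
    """Cosine similarity; returns 0.0 if either vector is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def nearest_words(seed, k):
    """Word-level stage: the k vocabulary words closest to a seed term."""
    seed_vec = WORD_VECTORS[seed]
    scored = [
        (cosine(seed_vec, vec), word)
        for word, vec in WORD_VECTORS.items()
        if word != seed
    ]
    scored.sort(reverse=True)
    return [word for _, word in scored[:k]]


def sentence_vector(sentence):
    """Sentence-level stage: average the vectors of the known words."""
    tokens = [t.strip(".,;:") for t in sentence.lower().split()]
    vectors = [WORD_VECTORS[t] for t in tokens if t in WORD_VECTORS]
    if not vectors:
        return [0.0, 0.0, 0.0]
    return [sum(column) / len(vectors) for column in zip(*vectors)]


def rank_sentences(query, sentences):
    """Order sentences by similarity to the query, most similar first."""
    query_vec = sentence_vector(query)
    return sorted(
        sentences,
        key=lambda s: cosine(query_vec, sentence_vector(s)),
        reverse=True,
    )


if __name__ == "__main__":
    # Cascade: expand a seed phenotype term at the word level, then use
    # the expanded terms to rank whole sentences at the sentence level.
    related = nearest_words("dwarf", 2)
    query = " ".join(["dwarf"] + related)
    corpus = [
        "The protein binds DNA.",
        "The mutant plants show short stature.",
    ]
    for sentence in rank_sentences(query, corpus):
        print(sentence)
```

Averaging word vectors is chosen here only because it is the simplest sentence representation to show; the paper may use a different sentence embedding model.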

Similar articles

1
Phenotype Extraction Based on Word Embedding to Sentence Embedding Cascaded Approach.
IEEE Trans Nanobioscience. 2018 Jul;17(3):172-180. doi: 10.1109/TNB.2018.2838137. Epub 2018 May 18.
2
Neural sentence embedding models for semantic similarity estimation in the biomedical domain.
BMC Bioinformatics. 2019 Apr 11;20(1):178. doi: 10.1186/s12859-019-2789-2.
3
A gene-phenotype relationship extraction pipeline from the biomedical literature using a representation learning approach.
Bioinformatics. 2018 Jul 1;34(13):i386-i394. doi: 10.1093/bioinformatics/bty263.
4
Embedding assisted prediction architecture for event trigger identification.
J Bioinform Comput Biol. 2015 Jun;13(3):1541001. doi: 10.1142/S0219720015410012. Epub 2015 Jan 11.
5
A modular framework for biomedical concept recognition.
BMC Bioinformatics. 2013 Sep 24;14:281. doi: 10.1186/1471-2105-14-281.
6
Word and Sentence Embedding Tools to Measure Semantic Similarity of Gene Ontology Terms by Their Definitions.
J Comput Biol. 2019 Jan;26(1):38-52. doi: 10.1089/cmb.2018.0093. Epub 2018 Oct 31.
7
Textpresso: an ontology-based information retrieval and extraction system for biological literature.
PLoS Biol. 2004 Nov;2(11):e309. doi: 10.1371/journal.pbio.0020309. Epub 2004 Sep 21.
8
Fast and scalable neural embedding models for biomedical sentence classification.
BMC Bioinformatics. 2018 Dec 22;19(1):541. doi: 10.1186/s12859-018-2496-4.
9
Text mining and protein annotations: the construction and use of protein description sentences.
Genome Inform. 2006;17(2):121-30.
10
Word Spotting and Recognition with Embedded Attributes.
IEEE Trans Pattern Anal Mach Intell. 2014 Dec;36(12):2552-66. doi: 10.1109/TPAMI.2014.2339814.

Cited by

1
Model-Based Reasoning of Clinical Diagnosis in Integrative Medicine: Real-World Methodological Study of Electronic Medical Records and Natural Language Processing Methods.
JMIR Med Inform. 2020 Dec 21;8(12):e23082. doi: 10.2196/23082.