A concept-driven biomedical knowledge extraction and visualization framework for conceptualization of text corpora.

Affiliation

Department of Computer Science, Jamia Millia Islamia (A Central University), New Delhi, India.

Publication information

J Biomed Inform. 2010 Dec;43(6):1020-35. doi: 10.1016/j.jbi.2010.09.008. Epub 2010 Sep 24.

DOI: 10.1016/j.jbi.2010.09.008
PMID: 20870033

Abstract

A number of techniques, such as information extraction, document classification, document clustering, and information visualization, have been developed to ease the extraction and understanding of information embedded in text documents. However, knowledge embedded in natural language texts is difficult to extract using simple pattern-matching techniques, and most of these methods do not help users directly understand key concepts and their semantic relationships in document corpora, which are critical for capturing their conceptual structures. The problem arises because most of the information is embedded in unstructured or semi-structured texts that computers cannot easily interpret. In this paper, we present a novel Biomedical Knowledge Extraction and Visualization framework, BioKEVis, to identify key information components from biomedical text documents. The information components are centered on key concepts. BioKEVis applies linguistic analysis and Latent Semantic Analysis (LSA) to identify key concepts. The extraction of information components is based on natural language processing techniques and semantic analysis. The system is also integrated with a biomedical named entity recognizer, ABNER, to tag genes, proteins, and other entity names in the text. We also present a method for collating information extracted from multiple sources to generate a semantic network. The network provides distinct user perspectives, allows navigation over documents with similar information components, and provides a comprehensive view of the collection. The system stores the extracted information components in a structured repository, which is integrated with a query-processing module to handle biomedical queries over text documents. We also propose a document ranking mechanism to present retrieved documents in order of their relevance to the user query.
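The abstract does not say how information components are represented, how the semantic network is assembled, or how documents are scored. The following is only a minimal sketch of the general idea, under loudly stated assumptions: components are modeled as hypothetical (concept, relation, concept) triples, the network is a map from each triple to the documents that support it, and ranking simply counts the edges that touch a query concept. The toy data and the names `build_network` and `rank_documents` are illustrative and do not come from BioKEVis.

```python
from collections import defaultdict

# Hypothetical information components: (source concept, relation, target concept)
# triples keyed by document ID. This layout is an assumption, not the paper's schema.
components = {
    "doc1": [("BRCA1", "interacts_with", "RAD51"),
             ("BRCA1", "associated_with", "breast cancer")],
    "doc2": [("BRCA1", "interacts_with", "RAD51"),
             ("TP53", "regulates", "apoptosis")],
    "doc3": [("TP53", "associated_with", "breast cancer")],
}


def build_network(components):
    """Merge triples from all documents into one edge -> supporting-documents map."""
    network = defaultdict(set)
    for doc_id, triples in components.items():
        for triple in triples:
            network[triple].add(doc_id)
    return network


def rank_documents(network, query_concepts):
    """Score each document by the number of its edges that touch a query concept."""
    scores = defaultdict(int)
    for (source, _relation, target), doc_ids in network.items():
        if source in query_concepts or target in query_concepts:
            for doc_id in doc_ids:
                scores[doc_id] += 1
    # Highest score first; ties broken by document ID for a stable order.
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))


network = build_network(components)
ranking = rank_documents(network, {"BRCA1"})
print(ranking)
```

Keeping the supporting document IDs on each edge is what would let a user navigate from a shared component to every document that mentions it. A real system would likely weight edges rather than count them, which this sketch does not attempt.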


Similar articles

1. A concept-driven biomedical knowledge extraction and visualization framework for conceptualization of text corpora.
   J Biomed Inform. 2010 Dec;43(6):1020-35. doi: 10.1016/j.jbi.2010.09.008. Epub 2010 Sep 24.
2. The interaction of domain knowledge and linguistic structure in natural language processing: interpreting hypernymic propositions in biomedical text.
   J Biomed Inform. 2003 Dec;36(6):462-77. doi: 10.1016/j.jbi.2003.11.003.
3. A knowledge-driven approach to biomedical document conceptualization.
   Artif Intell Med. 2010 Jun;49(2):67-78. doi: 10.1016/j.artmed.2010.02.005. Epub 2010 Apr 3.
4. Recognizing names in biomedical texts: a machine learning approach.
   Bioinformatics. 2004 May 1;20(7):1178-90. doi: 10.1093/bioinformatics/bth060. Epub 2004 Feb 10.
5. Statistical search on the Semantic Web.
   Bioinformatics. 2008 Apr 1;24(7):1002-10. doi: 10.1093/bioinformatics/btn054. Epub 2008 Feb 8.
6. A hybrid method for relation extraction from biomedical literature.
   Int J Med Inform. 2006 Jun;75(6):443-55. doi: 10.1016/j.ijmedinf.2005.06.010. Epub 2005 Aug 10.
7. UMLS knowledge for biomedical language processing.
   Bull Med Libr Assoc. 1993 Apr;81(2):184-94.
8. Status of text-mining techniques applied to biomedical text.
   Drug Discov Today. 2006 Apr;11(7-8):315-25. doi: 10.1016/j.drudis.2006.02.011.
9. Enhanced information retrieval from narrative German-language clinical text documents using automated document classification.
   Stud Health Technol Inform. 2008;136:473-8.
10. The semantic pathfinder: using an authoring metaphor for generic multimedia indexing.
    IEEE Trans Pattern Anal Mach Intell. 2006 Oct;28(10):1678-89. doi: 10.1109/TPAMI.2006.212.

Cited by

1. Using Big Data to Evaluate the Association between Periodontal Disease and Rheumatoid Arthritis.
   AMIA Annu Symp Proc. 2015 Nov 5;2015:589-93. eCollection 2015.
2. Textrous!: extracting semantic textual meaning from gene sets.
   PLoS One. 2013 Apr 30;8(4):e62665. doi: 10.1371/journal.pone.0062665. Print 2013.
3. Effective use of latent semantic indexing and computational linguistics in biological and biomedical applications.
   Front Physiol. 2013 Jan 30;4:8. doi: 10.3389/fphys.2013.00008. eCollection 2013.