Creating knowledgebases to text-mine PUBMED articles using clustering techniques.

Author information

Crasto Chiquito J, Morse Thomas M, Migliore Michele, Nadkarni Prakash, Hines Michael, Brash Douglas E, Miller Perry L, Shepherd Gordon M

Affiliation

Center for Medical Informatics, Yale University, New Haven, Connecticut, USA.

Publication information

AMIA Annu Symp Proc. 2003;2003:821.

Original article: https://pmc.ncbi.nlm.nih.gov/articles/PMC1479923/

Abstract

Knowledgebase-mediated text-mining approaches work best when processing the natural language of domain-specific text. To enhance the utility of our successfully tested program, NeuroText, and to extend its methodologies to other domains, we have designed clustering algorithms; clustering is the principal step in automatically creating a knowledgebase. Our algorithms are designed to improve the quality of clustering by applying semantic and syntactic parsing to the test corpus.
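
The abstract does not describe the clustering algorithms themselves, so the following is only a minimal sketch of the general idea of grouping domain text by shared vocabulary. It is not NeuroText's method. It assumes a bag-of-words representation, cosine similarity, and a greedy single-pass clustering rule. The stopword list, the `threshold` value, the function names, and the example sentences are all hypothetical and chosen for illustration.

```python
import math
import re
from collections import Counter

# Hypothetical, deliberately tiny stopword list (an assumption for this sketch).
STOPWORDS = {"the", "of", "in", "a", "and", "to", "is", "are", "by"}


def tokenize(text):
    """Lowercase the text and keep alphabetic tokens that are not stopwords."""
    return [t for t in re.findall(r"[a-z]+", text.lower()) if t not in STOPWORDS]


def cosine(a, b):
    """Cosine similarity between two term-count vectors (Counters)."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def cluster(sentences, threshold=0.3):
    """Greedy single-pass clustering.

    Each sentence joins the most similar existing cluster if the similarity
    reaches the threshold; otherwise it starts a new cluster.
    Returns a list of clusters, each a list of sentence indices.
    """
    clusters = []  # each entry: {"centroid": Counter, "members": [indices]}
    for i, sentence in enumerate(sentences):
        vec = Counter(tokenize(sentence))
        best, best_sim = None, threshold
        for c in clusters:
            sim = cosine(vec, c["centroid"])
            if sim >= best_sim:
                best, best_sim = c, sim
        if best is None:
            clusters.append({"centroid": Counter(vec), "members": [i]})
        else:
            best["centroid"].update(vec)  # accumulate term counts
            best["members"].append(i)
    return [c["members"] for c in clusters]


# Made-up example sentences, not taken from any corpus.
sentences = [
    "Sodium channels are expressed in the dendrites of mitral cells.",
    "Mitral cell dendrites express sodium channels.",
    "Dopamine receptors modulate synaptic transmission.",
    "Synaptic transmission is modulated by dopamine receptors.",
]
groups = cluster(sentences)
print(groups)
```

A sketch like this uses surface vocabulary only. The semantic and syntactic parsing mentioned in the abstract would presumably change which tokens or features enter the vectors, but the abstract gives no detail on how.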

Similar articles

1

Creating knowledgebases to text-mine PUBMED articles using clustering techniques.

AMIA Annu Symp Proc. 2003;2003:821.

2

Text mining neuroscience journal articles to populate neuroscience databases.

Neuroinformatics. 2003;1(3):215-37. doi: 10.1385/NI:1:3:215.

3

Managing knowledge in neuroscience.

Methods Mol Biol. 2007;401:3-21. doi: 10.1007/978-1-59745-520-6_1.

4

The interaction of domain knowledge and linguistic structure in natural language processing: interpreting hypernymic propositions in biomedical text.

J Biomed Inform. 2003 Dec;36(6):462-77. doi: 10.1016/j.jbi.2003.11.003.

5

Extracting drug-drug interaction articles from MEDLINE to improve the content of drug databases.

AMIA Annu Symp Proc. 2005;2005:216-20.

6

An environment for relation mining over richly annotated corpora: the case of GENIA.

BMC Bioinformatics. 2006 Nov 24;7 Suppl 3(Suppl 3):S3. doi: 10.1186/1471-2105-7-S3-S3.

7

PuReD-MCL: a graph-based PubMed document clustering methodology.

Bioinformatics. 2008 Sep 1;24(17):1935-41. doi: 10.1093/bioinformatics/btn318. Epub 2008 Jul 1.

8

Discovering novel causal patterns from biomedical natural-language texts using Bayesian nets.

IEEE Trans Inf Technol Biomed. 2008 Nov;12(6):714-22. doi: 10.1109/TITB.2008.920793.

9

Corpus annotation for mining biomedical events from literature.

BMC Bioinformatics. 2008 Jan 8;9:10. doi: 10.1186/1471-2105-9-10.

10

Block-suffix shifting: fast, simultaneous medical concept set identification in large medical record corpora.

AMIA Annu Symp Proc. 2008 Nov 6;2008:424-8.
