Crasto Chiquito J, Morse Thomas M, Migliore Michele, Nadkarni Prakash, Hines Michael, Brash Douglas E, Miller Perry L, Shepherd Gordon M
Center for Medical Informatics, Yale University, New Haven, Connecticut, USA.
AMIA Annu Symp Proc. 2003;2003:821.
Knowledgebase-mediated text-mining approaches work best when processing the natural language of domain-specific text. To enhance the utility of our successfully tested program, NeuroText, and to extend its methodologies to other domains, we have designed clustering algorithms; clustering is the principal step in automatically creating a knowledgebase. Our algorithms are designed to improve the quality of clustering by parsing the test corpus both semantically and syntactically.