GOClonto: an ontological clustering approach for conceptualizing PubMed abstracts.

Affiliation

Biomedical Knowledge Engineering Laboratory, BK21 College of Dentistry, Seoul National University, 28 Yeongeon-dong, Jongro-gu, Seoul 110-749, Republic of Korea.

Publication information

J Biomed Inform. 2010 Feb;43(1):31-40. doi: 10.1016/j.jbi.2009.07.006. Epub 2009 Jul 25.

DOI:10.1016/j.jbi.2009.07.006

PMID:19635585

Abstract

Concurrent with progress in the biomedical sciences, an overwhelming amount of textual knowledge is accumulating in the biomedical literature. PubMed is the most comprehensive database for collecting and managing biomedical literature. To help researchers understand collections of PubMed abstracts, numerous clustering methods have been proposed that group similar abstracts by their shared features. However, most of these methods do not explore the semantic relationships among the resulting groups of documents, which could better illuminate how the PubMed abstracts are grouped. To address this issue, we propose an ontological clustering method called GOClonto for conceptualizing PubMed abstracts. GOClonto uses latent semantic analysis (LSA) and the Gene Ontology (GO) to identify key gene-related concepts and their relationships, and to allocate PubMed abstracts to these key gene-related concepts. On two PubMed abstract collections, the experimental results show that GOClonto is able to identify key gene-related concepts and outperforms the STC (suffix tree clustering) algorithm, the Lingo algorithm, the Fuzzy Ants algorithm, and the TRS (tolerance rough set)-based clustering algorithm. Moreover, the two ontologies generated by GOClonto show highly informative conceptual structures.
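The abstract gives no implementation details, so the following is only a minimal sketch of the general idea of using LSA to allocate abstracts to concept labels. It is not the authors' GOClonto algorithm. The toy abstracts, the two label strings standing in for GO term names, the choice of k = 2, and the helper names (term_vector, to_latent, assign) are all assumptions made for illustration. The sketch builds a term-by-document matrix, takes a truncated SVD, and assigns each abstract to the label whose latent vector has the highest cosine similarity with it. The step that uses GO relationships to link concepts is not modeled here.

```python
import numpy as np

# Toy stand-ins for PubMed abstracts (hypothetical data).
abstracts = [
    "dna repair pathway gene mutation",
    "dna repair enzyme damage response",
    "apoptosis cell death signaling gene",
    "apoptosis caspase cell death",
]
# Hypothetical candidate labels standing in for GO term names.
go_terms = ["dna repair", "apoptosis"]

vocab = sorted({w for text in abstracts for w in text.split()})
index = {w: i for i, w in enumerate(vocab)}


def term_vector(text):
    """Raw term-count vector over the corpus vocabulary."""
    v = np.zeros(len(vocab))
    for w in text.split():
        if w in index:
            v[index[w]] += 1.0
    return v


# Term-by-document matrix: rows are terms, columns are abstracts.
A = np.column_stack([term_vector(t) for t in abstracts])

# LSA step: keep the k strongest left singular vectors of A.
k = 2
U, s, Vt = np.linalg.svd(A, full_matrices=False)
U_k = U[:, :k]


def to_latent(v):
    """Project a term-space vector into the k-dimensional latent space."""
    return U_k.T @ v


def cosine(a, b):
    denom = np.linalg.norm(a) * np.linalg.norm(b)
    return float(a @ b / denom) if denom else 0.0


def assign(abstracts, go_terms):
    """Map each label to the indices of the abstracts closest to it."""
    label_latent = [to_latent(term_vector(t)) for t in go_terms]
    clusters = {t: [] for t in go_terms}
    for i, text in enumerate(abstracts):
        doc = to_latent(term_vector(text))
        scores = [cosine(doc, lv) for lv in label_latent]
        clusters[go_terms[int(np.argmax(scores))]].append(i)
    return clusters


clusters = assign(abstracts, go_terms)
print(clusters)
```

A real pipeline would likely weight terms (for example with TF-IDF), choose k from the data, and draw labels and parent-child links from the actual GO graph. Those choices are left open here because the abstract does not specify them.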


Similar articles

A knowledge-driven approach to biomedical document conceptualization.

Artif Intell Med. 2010 Jun;49(2):67-78. doi: 10.1016/j.artmed.2010.02.005. Epub 2010 Apr 3.

PuReD-MCL: a graph-based PubMed document clustering methodology.

Bioinformatics. 2008 Sep 1;24(17):1935-41. doi: 10.1093/bioinformatics/btn318. Epub 2008 Jul 1.

Biomedical knowledge navigation by literature clustering.

J Biomed Inform. 2007 Apr;40(2):114-30. doi: 10.1016/j.jbi.2006.07.004. Epub 2006 Aug 5.

Recognizing names in biomedical texts: a machine learning approach.

Bioinformatics. 2004 May 1;20(7):1178-90. doi: 10.1093/bioinformatics/bth060. Epub 2004 Feb 10.

Concept-based annotation of enzyme classes.

Bioinformatics. 2005 May 1;21(9):2059-66. doi: 10.1093/bioinformatics/bti284. Epub 2005 Jan 20.

Literature-based concept profiles for gene annotation: the issue of weighting.

Int J Med Inform. 2008 May;77(5):354-62. doi: 10.1016/j.ijmedinf.2007.07.004. Epub 2007 Sep 10.

Co-occurrence based meta-analysis of scientific texts: retrieving biological relationships between genes.

Bioinformatics. 2005 May 1;21(9):2049-58. doi: 10.1093/bioinformatics/bti268. Epub 2005 Jan 18.

Using Greedy algorithm: DBSCAN revisited II.

J Zhejiang Univ Sci. 2004 Nov;5(11):1405-12. doi: 10.1631/jzus.2004.1405.

Automatic extension of Gene Ontology with flexible identification of candidate terms.

Bioinformatics. 2006 Mar 15;22(6):665-70. doi: 10.1093/bioinformatics/btl010. Epub 2006 Jan 21.
