Knowledge Management in Bioinformatics, Institute for Computer Science, Humboldt-Universität zu Berlin, Unter den Linden 6, 10099 Berlin, Germany.
Nucleic Acids Res. 2012 Jul;40(Web Server issue):W585-91. doi: 10.1093/nar/gks563. Epub 2012 Jun 12.
Research results are primarily published in the scientific literature, and curation efforts cannot keep up with its rapid growth. A plethora of knowledge remains hidden in large text repositories such as MEDLINE. Consequently, life scientists have to spend a great deal of time searching for specific information. Most names of biomedical objects such as genes, chemicals and diseases are highly ambiguous, which often produces search results that are too large and too unspecific. We present GeneView, a semantic search engine for biomedical knowledge. GeneView is built upon a comprehensively annotated version of PubMed abstracts and openly available PubMed Central full texts. This semi-structured representation of biomedical texts enables a number of features that extend classical search engines. For instance, users may search for entities using unique database identifiers, or they may rank documents by the number of specific mentions they contain. Annotation is performed by a multitude of state-of-the-art text-mining tools that recognize mentions from 10 entity classes and identify protein-protein interactions. GeneView currently contains annotations for >194 million entities from 10 classes for ∼21 million citations with 271,000 full text bodies. GeneView can be searched at http://bc3.informatik.hu-berlin.de/.
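The two search features mentioned above (lookup by a unique database identifier, and ranking by the number of specific mentions) can be illustrated with a minimal sketch. This is not GeneView's implementation, which the abstract does not describe; the `Mention` and `Document` types, the `search_by_entity` function, and the identifiers used are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Mention:
    """One annotated entity mention in a text (hypothetical structure)."""
    entity_class: str  # e.g. "gene", "chemical", "disease"
    entity_id: str     # normalized database identifier (placeholder values here)
    text: str          # surface form as it appears in the document


@dataclass
class Document:
    """A citation together with the mentions found in it (hypothetical structure)."""
    doc_id: str
    mentions: list = field(default_factory=list)


def search_by_entity(documents, entity_id):
    """Return (doc_id, mention_count) pairs for documents that mention
    the given identifier, ordered by descending mention count.
    Ties are broken by doc_id so the ordering is deterministic."""
    ranked = []
    for doc in documents:
        count = sum(1 for m in doc.mentions if m.entity_id == entity_id)
        if count > 0:
            ranked.append((doc.doc_id, count))
    ranked.sort(key=lambda pair: (-pair[1], pair[0]))
    return ranked


# Toy corpus with placeholder identifiers (assumptions, not real database IDs).
docs = [
    Document("doc-1", [
        Mention("gene", "GENE:0001", "p53"),
        Mention("gene", "GENE:0001", "TP53"),
        Mention("disease", "DIS:0001", "cancer"),
    ]),
    Document("doc-2", [Mention("gene", "GENE:0001", "p53")]),
    Document("doc-3", [Mention("gene", "GENE:0002", "BRCA1")]),
]

results = search_by_entity(docs, "GENE:0001")
```

Because the lookup matches on the normalized identifier rather than the surface string, the synonyms "p53" and "TP53" in the toy corpus count toward the same entity, which is the kind of disambiguation that plain keyword search does not provide.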