Fontaine Jean-Fred, Barbosa-Silva Adriano, Schaefer Martin, Huska Matthew R, Muro Enrique M, Andrade-Navarro Miguel A
Computational Biology and Data Mining Group, Max Delbrück Center for Molecular Medicine, Robert-Rössle-Strasse 10, D-13125 Berlin, Germany.
Nucleic Acids Res. 2009 Jul;37(Web Server issue):W141-6. doi: 10.1093/nar/gkp353. Epub 2009 May 8.
The biomedical literature is represented by millions of abstracts available in the Medline database. These abstracts can be queried with the PubMed interface, which provides a keyword-based Boolean search engine. This approach is limited when retrieving abstracts related to very specific topics, as it is difficult for a non-expert user to find all of the most relevant keywords related to a biomedical topic. Additionally, when searching for more general topics, the same approach may return hundreds of unranked references. To address these issues, text mining tools have been developed to help scientists focus on relevant abstracts. We have implemented the MedlineRanker webserver, which allows a flexible ranking of Medline for a topic of interest without expert knowledge. Given some abstracts related to a topic, the program automatically deduces the words that best discriminate them from a random selection of abstracts. These words are used to score other abstracts, including those from recent publications that have not yet been annotated, which can then be ranked by relevance. We show that our tool can be highly accurate and that it is able to process millions of abstracts in a practical amount of time. MedlineRanker is free to use and is available at http://cbdm.mdc-berlin.de/tools/medlineranker.
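The ranking idea described above can be illustrated with a minimal sketch. The abstract does not state which weighting scheme is used, so the smoothed log-odds weighting below is an assumption chosen for illustration, and the function names (tokenize, word_weights, score, rank) and the toy abstracts are hypothetical, not part of MedlineRanker.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase a text and return its set of distinct alphabetic words."""
    return set(re.findall(r"[a-z]+", text.lower()))


def word_weights(topic_abstracts, background_abstracts):
    """Weight each word by how much more often it occurs in topic abstracts
    than in background (random) abstracts.

    Assumption: add-one smoothed log-odds of document frequencies; the
    source abstract does not specify the actual formula.
    """
    topic_df = Counter(w for a in topic_abstracts for w in tokenize(a))
    back_df = Counter(w for a in background_abstracts for w in tokenize(a))
    n_topic, n_back = len(topic_abstracts), len(background_abstracts)
    weights = {}
    for word in set(topic_df) | set(back_df):
        p_topic = (topic_df[word] + 1) / (n_topic + 2)
        p_back = (back_df[word] + 1) / (n_back + 2)
        weights[word] = math.log(p_topic / p_back)
    return weights


def score(abstract, weights):
    """Sum the weights of the distinct words in an abstract (unseen words count 0)."""
    return sum(weights.get(w, 0.0) for w in tokenize(abstract))


def rank(candidates, weights):
    """Return candidate abstracts sorted from highest to lowest score."""
    return sorted(candidates, key=lambda a: score(a, weights), reverse=True)


if __name__ == "__main__":
    # Toy data, invented for illustration only.
    topic = [
        "Kinase signaling regulates apoptosis in tumor cells",
        "Tumor suppressor p53 induces apoptosis",
    ]
    background = [
        "Soil bacteria fix nitrogen in legumes",
        "Bird migration follows magnetic cues",
    ]
    weights = word_weights(topic, background)
    for abstract in rank(["Bird migration over soil", "Apoptosis in tumor tissue"], weights):
        print(round(score(abstract, weights), 3), abstract)
```

In this sketch, words frequent in the topic set receive positive weights, words frequent in the background receive negative weights, and words equally common in both contribute nothing, so candidate abstracts that share vocabulary with the topic set rise to the top of the ranking.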